Statistics are proposed for testing the null hypothesis
of  bivariate  symmetry  with  censored  matched  pairs.   The  two 
types  of  alternatives  considered  are  (1)  the  marginal 
distributions  have  a  common  location  parameter  (either  known 
or  unknown)  and  differ  only  in  their  scale  parameters  and 
(2)  the  marginal  distributions  differ  in  their  locations 
and/or  scales.   For  the  first  alternative,  two  types  of 
statistics  are  proposed.   The  first  is  a  statistic  based  on 
Kendall's  tau  modified  for  censored  data,  while  the  second 
type  is  a  class  of  statistics  consisting  of  linear 
combinations of two statistics. Conditional on $N_1$, the
number of pairs in which both members are uncensored, and
$N_2$, the number of pairs in which exactly one member is
censored, the two statistics used in the linear combination
are independent and each has a null distribution equivalent
to that of a Wilcoxon signed rank statistic. Thus, any
member of the class can be used to provide an exact test
which is distribution-free under the null hypothesis. The
statistic based on Kendall's tau is not distribution-free
for small sample sizes, and thus a permutation test based on
the statistic is recommended in these cases. For large
samples, a modified version of the Kendall's tau statistic
is shown to be asymptotically distribution-free.

For the second and more general alternative, a small
sample permutation test is proposed based on the quadratic
form $W_n = T_n' \Sigma^{-1} T_n$, where $T_n$ is a 2-vector of
statistics composed of a statistic designed to detect location
differences and a statistic designed to detect scale
differences, and $\Sigma$ is the variance-covariance matrix for
$T_n$. For large samples, a distribution-free approximation
for $T_n' \Sigma^{-1} T_n$ is recommended.

Monte  Carlo  results  are  presented  which  compare  the  two 
types  of  statistics  for  detecting  alternative  (1),  for 
sample sizes of 25 and 40. Quadratic form statistics $W_n$
using  different  scale  statistic  components  are  also  compared 
in  a  simulation  study  for  samples  of  size  35.   For  the 
alternative  involving  scale  differences  only,  the  statistic 
based  on  Kendall's  tau  performed  best  overall  but  requires  a 
computer  to  do  the  calculations  for  moderate  sample  sizes. 
For  the  more  general  alternative  of  location  and/or  scale 
differences,  the  quadratic  form  using  the  scale  statistic 
based  on  Kendall's  tau  performed  the  best  overall. 


CHAPTER  ONE 
INTRODUCTION 


Let $W_1$ and $W_2$ denote random variables; then bivariate
symmetry is the property that $(W_1, W_2)$ has the same
distribution as $(W_2, W_1)$. This property is also
referred to as exchangeability (or bivariate
exchangeability). Commonly, it arises as the null
hypothesis in settings in which a researcher has paired
observations, such as when the subjects or sampling units
function as both the treatment group and the control group,
or when the researcher has matched the subjects
according to some criteria such as age and sex.

For  example,  a  dentist  may  want  to  assess  the 
effectiveness  of  a  dentifrice  in  reducing  dental 
sensitivity. The dentist randomly selects n patients and
schedules two appointments for each patient, three months
apart. During the first visit, a hygienist assesses the
patient's dental sensitivity, after which the patient is
given the dentifrice by the dentist. At the end of the
three-month usage period, the patient returns and his or her
dental sensitivity is again assessed. If $X_{1i}$ and $X_{2i}$ are
the first and second sensitivity measurements, respectively,
of the $i$th patient, the dentist has n bivariate pairs in the
sample. If there is no treatment effect, then effectively
the two observations of dental sensitivity are two
measurements of exactly the same characteristic at two
randomly chosen points in time. In that case, the
distribution of $(X_{1i}, X_{2i})$ is the same as that of
$(X_{2i}, X_{1i})$, and so a test using the null hypothesis of
bivariate symmetry would be appropriate.

The  possible  alternatives  for  a  test  which  uses  a  null 
hypothesis  of  bivariate  symmetry  are  numerous.   The  three 
types  of  alternatives  which  will  be  considered  in  this  work 
are  the  following: 

1)  The  marginal  distributions  have  a  common 
known  location  parameter  and  differ  only  in 
their  scale  parameters. 

2)  The  marginal  distributions  have  a  common 
unknown  location  parameter  and  differ  only  in 
their  scale  parameters. 

3)  The  marginal  distributions  differ  in  their 
location  and/or  scale  parameters. 


The  situation  under  consideration  in  this  work  is 
further  complicated  by  the  possibility  of  censoring. 
Censoring occurs whenever the measurement of interest is not
observable, which can happen for a variety of reasons. The most
common  situation  is  when  the  measurement  is  the  time  to 
"failure"  (i.e.,  death,  the  time  until  a  drug  becomes 
effective,  the  length  of  time  a  drug  remains  effective, 
etc.)  for  an  experimental  unit  subjected  to  a  specific 
treatment.   If  at  the  end  of  the  experiment,  the 
experimental  unit  still  has  not  "failed,"  then  the 
corresponding  time  to  "failure"  (referred  to  as  survival 
time) is censored. All that is known is that the survival
time is longer than the observation time for that unit; the
observation is thus right censored. An example of censoring in
bivariate  pairs  could  be  the  times  to  failure  of  the  left 
and  right  kidneys  or  the  times  to  cancer  detection  in  the 
left  and  right  breasts  (Miller,  1981). 

Many  different  types  of  right  censoring  exist  (Type  I, 
Type  II  and  random  right  censoring),  each  determined  by 
restrictions  placed  on  the  experiment.   Type  I  censoring 
occurs  if  the  observation  time  for  each  experimental  unit  is 
preassigned  some  fixed  length  T.   Thus,  if  the  survival  time 
for  a  unit  is  larger  than  T,  it  is  right  censored.   Type  II 
censoring  occurs  when  the  experiment  is  designed  to  be 
terminated as soon as the $r$th ($r < n$, with $r$ fixed) failure
occurs.   Random  right  censoring  is  a  generalization  of  Type 
I  censoring,  in  which  the  experimental  units  each  have  their 
own  length  of  observation  (which  are  not  necessarily  the 
same). This would occur, for example, if the length of the
experiment were fixed but random entry into the experiment
were allowed. It is this latter type of censoring which this
work addresses.

Now we formulate the problem of interest statistically.
Let $(X'_{1i}, X'_{2i})$, for $i = 1, 2, \ldots, n$, denote a random
sample of bivariate pairs which are independent and
identically distributed (i.i.d.), and let $C_i$, $i = 1, 2, \ldots, n$,
denote a random sample of censoring times which are i.i.d., such
that $C_i$ denotes the value of the censoring variable associated
with pair $(X'_{1i}, X'_{2i})$. In the case of random right censoring,
the observed sample consists of $(X_{1i}, X_{2i}, \delta_i)$, where
$X_{1i} = \min(X'_{1i}, C_i)$, $X_{2i} = \min(X'_{2i}, C_i)$ and $\delta_i$ is a
random variable which indicates what type of censoring occurred:

$\delta_i$    Description

1      $X'_{1i} < C_i$, $X'_{2i} < C_i$

2      $X'_{1i} < C_i$, $X'_{2i} > C_i$

3      $X'_{1i} > C_i$, $X'_{2i} < C_i$

4      $X'_{1i} > C_i$, $X'_{2i} > C_i$
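
The observation scheme just described can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the dissertation: the function names `observe_pair` and `simulate_sample` are hypothetical, and the independent exponential lifetimes and censoring times are assumed only so that there is something to simulate.

```python
import random


def observe_pair(x1_true, x2_true, c):
    """Apply one common right-censoring time c to a pair of true values.

    Returns (x1_obs, x2_obs, delta), where delta is the pair type 1-4.
    Ties with c are treated as censored; they have probability zero
    for continuous variables.
    """
    x1_obs = min(x1_true, c)
    x2_obs = min(x2_true, c)
    if x1_true < c and x2_true < c:
        delta = 1  # neither member censored
    elif x1_true < c:
        delta = 2  # only the second member censored
    elif x2_true < c:
        delta = 3  # only the first member censored
    else:
        delta = 4  # both members censored
    return x1_obs, x2_obs, delta


def simulate_sample(n, seed=0):
    """Draw n censored pairs (illustrative distributions, assumed here)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x1 = rng.expovariate(1.0)   # true first member
        x2 = rng.expovariate(1.0)   # true second member
        c = rng.expovariate(0.5)    # censoring time shared by the pair
        sample.append(observe_pair(x1, x2, c))
    return sample
```

Because the same censoring time is applied to both members, a type 4 pair always has equal observed values.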

Now we state a set of assumptions which are referred to
later.

Assumptions:

A1.  $(X'_{1i}, X'_{2i})$, $i = 1, 2, \ldots, n$, are i.i.d. as the
bivariate random variable $(X'_{11}, X'_{21})$.

A2.  $(X'_{11}, X'_{21})$ has an absolutely continuous bivariate
distribution function

$$F\left(\frac{x_1 - \mu_1}{\sigma_1}, \frac{x_2 - \mu_2}{\sigma_2}\right),$$

where $F(u,v) = F(v,u)$ for every $(u,v)$ in $R^2$. The
parameters $\mu_1$ ($\mu_2$) and $\sigma_1$ ($\sigma_2$) are location and
scale parameters, respectively. They are not
necessarily the mean and standard deviation of the
marginal distributions.

A3.  $C_1, C_2, \ldots, C_n$ are i.i.d. continuous random
variables, with continuous distribution function
$G(c)$.

A4.  The censoring random variable $C_i$ is independent
of $(X'_{1i}, X'_{2i})$, $i = 1, 2, \ldots, n$, and the value of $C_i$ is
the same for both members of a given pair.

A5.  $P(X'_{1i} > C_i, X'_{2i} > C_i) < 1$.

A6.  $G(F_{X'_l}^{-1}(1/2)) < 1$, where $F_{X'_l}$ denotes the marginal
cumulative distribution function (c.d.f.) of $X'_{li}$,
$l = 1, 2$.


Note that under A5, the probability is positive that the
sample will contain observations that are not doubly
censored.

With this notation, the null and alternative hypotheses
can now be formally stated. The null hypothesis is
$H_0$: $\mu_1 = \mu_2$, $\sigma_1 = \sigma_2$ versus the alternatives:

1.    The case where $\mu_1 = \mu_2 = \mu$ with $\mu$ known,
$H_a$: $\sigma_1 \neq \sigma_2$

2.    The case where $\mu_1 = \mu_2 = \mu$ with $\mu$ unknown,
$H_a$: $\sigma_1 \neq \sigma_2$

3.    $H_a$: $\mu_1 \neq \mu_2$ and/or $\sigma_1 \neq \sigma_2$.


Chapters Two and Three will present test statistics for
alternatives  1)  and  2).   Chapter  Four  will  present  a  test 
for  the  more  general  alternative  stated  in  3).   Monte  Carlo 
results  and  conclusions  will  be  presented  in  Chapter  Five. 
First  though,  we  describe  related  work  in  the  literature. 

Since  this  dissertation  combines  two  areas  of  previous 
development,  that  is,  bivariate  symmetry  and  censoring,  the 
first  part  of  the  review  will  deal  with  related  works  in 
bivariate  symmetry  without  a  censoring  random  variable 
considered.   The  second  part  of  the  review  will  mention 
related  works  for  censored  matched  pairs. 

The  first  four  articles  to  be  considered,  Sen  (1967), 
Bell  and  Haller  (1969),  Hollander  (1971)  and  Kepner  (1979), 
all  suggest  tests  directed  towards  specific  alternatives  to 
the  null  hypothesis  of  bivariate  symmetry.   The  work  of 
Kepner (1979) influenced the development of this thesis more
directly than the others, but the others directly influenced
Kepner's work and thus will be mentioned.

Sen's article (1967) dealt with the construction of
conditionally distribution-free nonparametric tests for the
null hypothesis of bivariate symmetry versus alternatives
that the marginal distributions differed only in location,
or that the marginal distributions differed only in scale, or
that the marginal distributions differed in both location
and scale. The basic idea behind his tests is the
following. Under $H_0$, the pairs $(X'_{1i}, X'_{2i})$, $i = 1, 2, \ldots, n$,
are a random sample from an exchangeable continuous
distribution. He pools all the elements into one sample (of
size $N = 2n$), ignoring the fact that the original observations
were bivariate pairs, and then ranks this combined sample. From
this, Sen obtains what he refers to as the rank matrix, $R_N$,

$$R_N = \begin{pmatrix} R_{11} & R_{12} & \cdots & R_{1n} \\ R_{21} & R_{22} & \cdots & R_{2n} \end{pmatrix}$$

where $R_{li}$ is the rank of $X'_{li}$ in the pooled sample, $l = 1, 2$,
$i = 1, 2, \ldots, n$. Let $S(R_N)$ be the set of all rank matrices that
can be obtained from $R_N$ by permuting within the same column
of $R_N$ for one or more columns. Under $H_0$, each of the $2^n$
elements of $S(R_N)$ is equally likely and thus, if $T_n$ is a
statistic with a probability distribution (given $S(R_N)$ and
$H_0$) which depends only on the $2^n$ equally likely permutations
of $R_N$, $T_n$ is conditionally distribution-free (conditional on
the given $R_N$ and thus on the $S(R_N)$ observed). Sen's statistic
$T_n$ can be defined as

$$T_n = \frac{1}{n} \sum_{i=1}^{n} E_{N, R_{1i}}$$

where $E_{N,i}$ is a score function based on $N = 2n$ and $i$ alone.

For the test of location differences only, Sen suggests
using the Wilcoxon scores ($E_{N,i} = i/(N+1)$) or the quantile
scores ($E_{N,i} = F^{-1}(i/(N+1))$, where $F$ is an appropriately
chosen absolutely continuous c.d.f.). The Ansari-Bradley scores
($E_{N,i} = \frac{N+1}{2} - \left| i - \frac{N+1}{2} \right|$) or the Mood scores
($E_{N,i} = \left( \frac{i}{N+1} - \frac{1}{2} \right)^2$) are suggested for use when the
alternative is that the marginal distributions differ only
in their scale parameters. For the more general
alternative, that the marginal distributions differ in
location and scale, he recommends making a vector (of size
2) of his statistics where one component is one of the
statistics for differences in location and the other for
scale.
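
Sen's conditional construction can be sketched in Python as follows. This is an illustrative sketch only, under the assumptions that there are no ties and that Wilcoxon scores $i/(N+1)$ are used; the function names are hypothetical. It enumerates all $2^n$ within-column swaps of the rank matrix, so it is practical only for small n.

```python
from itertools import product


def pooled_ranks(x1, x2):
    """Rank all 2n observations together; return the two rows of ranks."""
    n = len(x1)
    pooled = [(v, 0, i) for i, v in enumerate(x1)]
    pooled += [(v, 1, i) for i, v in enumerate(x2)]
    pooled.sort()
    row1, row2 = [0] * n, [0] * n
    for rank, (_, coord, i) in enumerate(pooled, start=1):
        if coord == 0:
            row1[i] = rank
        else:
            row2[i] = rank
    return row1, row2


def wilcoxon_score(i, big_n):
    """Wilcoxon score for rank i in a pooled sample of size big_n."""
    return i / (big_n + 1)


def sen_statistic(first_row, big_n, score=wilcoxon_score):
    """Average score of the ranks in the first row of a rank matrix."""
    return sum(score(r, big_n) for r in first_row) / len(first_row)


def sen_permutation_distribution(x1, x2, score=wilcoxon_score):
    """Observed statistic and its values over all 2^n column swaps."""
    row1, row2 = pooled_ranks(x1, x2)
    n = len(row1)
    big_n = 2 * n
    observed = sen_statistic(row1, big_n, score)
    values = []
    for swaps in product((0, 1), repeat=n):
        # A 1 in position i swaps the two ranks in column i.
        first = [row2[i] if s else row1[i] for i, s in enumerate(swaps)]
        values.append(sen_statistic(first, big_n, score))
    return observed, values
```

A conditional p-value would then be the proportion of `values` at least as extreme as `observed`; how "extreme" is measured depends on the alternative and is left open here.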

One basic weakness of Sen's proposals, as mentioned by
Kepner (1979), is that the procedure ignores the
correlation structure within the original observations
$(X'_{1i}, X'_{2i})$, $i = 1, 2, \ldots, n$. Kepner thus suggests that a
better test could possibly be constructed by exploiting the
natural pairing of the observations.

The test proposed by Bell and Haller (1969) does
exploit this natural pairing of the observations. They
suggest both parametric and nonparametric tests for
bivariate symmetry. In the normal case, they form the
likelihood ratio test for the transformed observations
$(Y_{1i}, Y_{2i})$, where $Y_{1i} = X_{1i} - X_{2i}$ and $Y_{2i} = X_{1i} + X_{2i}$. The
resulting test they suggest when dealing with a bivariate
normal distribution is to reject $H_0$ if $|B_1| > t(\beta_1; n-2)$ or
$|B_2| > t(\beta_2; n-1)$, where

$$B_1 = \frac{(n-2)^{1/2}\, r(Y_1, Y_2)}{(1 - r^2(Y_1, Y_2))^{1/2}} \quad \text{and} \quad B_2 = \frac{n^{1/2}\, \bar{Y}}{S},$$

$r(Y_1, Y_2)$ is the sample correlation coefficient of the
$Y_{1i}$'s and $Y_{2i}$'s, $\bar{Y}$ and $S^2$ are the sample mean and unbiased
sample variance, respectively, of the $Y_{1i}$'s, and $t(\beta; n)$
represents the critical value for a t distribution with n
degrees of freedom which cuts off $\beta$ area in the right
tail. The main problem with this test, as Kepner (1979)
also states, is that the overall level of the test, $\alpha$, is

$$\alpha = 2\beta_1 + 2\beta_2 - 4\beta_1\beta_2,$$

so relatively small values for $\beta_1$ and $\beta_2$ would need to be
chosen.
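
As a check on the level formula, assume the two component tests are independent under $H_0$ (an assumption made here for illustration, consistent with the product term in the formula). Each two-sided test then has level $2\beta_k$, and the probability that at least one of them rejects is

$$\alpha = 1 - (1 - 2\beta_1)(1 - 2\beta_2) = 2\beta_1 + 2\beta_2 - 4\beta_1\beta_2.$$

For instance, $\beta_1 = \beta_2 = 0.0125$ gives $\alpha = 0.05 - 0.000625 = 0.049375$.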

The nonparametric tests they suggest are either
complicated, due to the many estimation problems involved,
have low power, or are unappealing because the test is
somewhat researcher dependent (that is, different
researchers working independently with the same data could
reach different conclusions). Thus, they will not be
described here.

Hollander (1971) introduced a nonparametric test for
the null hypothesis of bivariate symmetry which is generally
appealing and consistent against a wide class of
alternatives. He suggested

$$D_n = \int\!\!\int \{F_n(x,y) - F_n(y,x)\}^2 \, dF_n(x,y)$$

where

$$F_n(x,y) = \frac{1}{n} \sum_{i=1}^{n} I_{(-\infty,x]}(X_{1i})\, I_{(-\infty,y]}(X_{2i})$$

is the bivariate empirical c.d.f. He notes that $nD_n$ is neither
distribution-free nor asymptotically distribution-free when
$H_0$ is true, and thus proposed a conditional test in which
the conditioning process is based on the $2^n$ data points

$$\{(x_{11}, x_{21})^{(j_1)}, \ldots, (x_{1n}, x_{2n})^{(j_n)} : j_k = 0 \text{ or } 1 \text{ for } k = 1, 2, \ldots, n\}$$

which are equally likely under $H_0$. Here we let
$(s,t)^{(0)} = (s,t)$ and $(s,t)^{(1)} = (t,s)$. This statistic
performs well even for extremely small sample sizes (n = 5),
with one major drawback, as mentioned by Hollander: the
computer time which it takes to evaluate $nD_n$. It
becomes prohibitive for even moderate n. Koziol (1979)
developed critical values for $nD_n$ for large sample
sizes, which work much better than the large sample critical
value approximations originally suggested by Hollander.
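
A direct evaluation of Hollander's statistic can be sketched as follows. Since the empirical c.d.f. puts mass 1/n on each observed pair, the integral becomes an average over the data points. The function names are hypothetical and the code is an illustrative sketch, not Hollander's algorithm.

```python
def empirical_cdf(x1, x2, x, y):
    """Bivariate empirical c.d.f. evaluated at the point (x, y)."""
    n = len(x1)
    return sum(1 for a, b in zip(x1, x2) if a <= x and b <= y) / n


def hollander_d(x1, x2):
    """D_n: the average of {F_n(x, y) - F_n(y, x)}^2 over the data points."""
    n = len(x1)
    total = 0.0
    for a, b in zip(x1, x2):
        diff = empirical_cdf(x1, x2, a, b) - empirical_cdf(x1, x2, b, a)
        total += diff * diff
    return total / n
```

Each evaluation costs on the order of $n^2$ operations in this naive form, and the conditional test repeats it for up to $2^n$ coordinate swaps, which illustrates why the computing time grows so quickly with n.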

Kepner (1979) proposed tests based on the transformed
observations $(Y_{1i}, Y_{2i})$ of Bell and Haller for the null
hypothesis of bivariate symmetry versus the alternatives
that the marginal distributions differ in scale or that the
marginal distributions differ in location and/or scale. For
the alternative of differences in scale, he proposed a test
statistic, $\tau_n$,

$$\tau_n = \binom{n}{2}^{-1} \sum_{i<j} \Psi\{(Y_{1j} - Y_{1i})(Y_{2j} - Y_{2i})\}$$

where

$$\Psi(t) = \begin{cases} 1 & \text{if } t > 0 \\ 0 & \text{if } t \le 0 \end{cases}$$

which is Kendall's tau applied to the transformed
observations. He noted that $\tau_n$ is neither distribution-free
nor asymptotically distribution-free in this setting and
thus recommended a permutation test which is conditionally
distribution-free based on $\tau_n$ for small samples. This
permutation test was based on conditioning on what he called
the collection matrix, $C_n$,

$$C_n = \begin{pmatrix} |Y_{11}| & |Y_{12}| & \cdots & |Y_{1n}| \\ Y_{21} & Y_{22} & \cdots & Y_{2n} \end{pmatrix}.$$

He noted that under $H_0$ and conditional on $C_n$, there are $2^n$
equally likely transformed samples $(\psi_i |Y_{1i}|, Y_{2i})$ possible,
each being determined by a different collection of $\psi_i$'s,
where $\psi_i \in \{1, -1\}$. For larger samples, he obtains the
asymptotic distribution which can be used to approximate the
permutation test.
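
The permutation test conditional on the collection matrix can be sketched in Python as below. Several choices here are assumptions made for illustration and are not taken from Kepner: the kernel is taken to be the indicator of a positive product, so that the statistic is the proportion of concordant pairs; the p-value is two-sided and centred at 1/2; ties are assumed absent; and the function names are hypothetical.

```python
from itertools import combinations, product


def tau_n(y1, y2):
    """Proportion of concordant pairs among all pairs i < j."""
    pairs = list(combinations(range(len(y1)), 2))
    concordant = sum(
        1 for i, j in pairs if (y1[j] - y1[i]) * (y2[j] - y2[i]) > 0
    )
    return concordant / len(pairs)


def kepner_permutation_pvalue(x1, x2):
    """Two-sided conditional p-value from all 2^n sign assignments."""
    y1 = [a - b for a, b in zip(x1, x2)]  # differences
    y2 = [a + b for a, b in zip(x1, x2)]  # sums
    observed = tau_n(y1, y2)
    abs_y1 = [abs(v) for v in y1]
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(y1)):
        flipped = [s * v for s, v in zip(signs, abs_y1)]
        t = tau_n(flipped, y2)
        total += 1
        # Small tolerance guards against floating-point noise.
        if abs(t - 0.5) >= abs(observed - 0.5) - 1e-12:
            extreme += 1
    return observed, extreme / total
```

Flipping every sign at once maps the statistic to one minus itself when there are no ties, which is why centring the two-sided comparison at 1/2 is a natural choice in this sketch.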

One nice property of the statistic $\tau_n$, which Kepner
notes, is that $\tau_n$ is insensitive to unequal marginal
locations, and thus location differences do not influence the
performance of the test.

For the more general alternative of location and/or
scale differences, a small sample permutation test for
bivariate symmetry was proposed based on the quadratic form

$$V_n = T_n' \Sigma_n^{-1} T_n$$

where

$$T_n = \begin{pmatrix} n^{-3/2}\{W^+ - n(n+1)/4\} \\ n^{1/2}(\tau_n - 1/2) \end{pmatrix},$$

$W^+$ is the Wilcoxon signed rank test statistic calculated on
the $Y_{1i}$'s and $\tau_n$ is as previously defined. Again, the
conditioning of the test is on the collection matrix $C_n$. He
obtains the limiting distribution of the small sample
permutation test and proposes a large sample
distribution-free approximation which is computationally
efficient.

The  second  collection  of  articles  which  will  be 
mentioned  deals  with  the  topic  of  censored  matched  pairs. 
Much  work  has  been  done  recently  in  the  area  of  censored 
data,  but  the  work  of  Woolson  and  Lachenbruch  (1980)  and 
Popovich  (1983)  most  directly  influence  the  results  in  this 
thesis  and  thus  will  be  described  here. 

Woolson and Lachenbruch (1980) considered the problem
of testing for differences in location using censored
matched pair data. The situation they considered is
identical to the situation developed in this thesis if one
assumes equality of the scale parameters. They utilized the
concept of the generalized rank vector introduced by
Kalbfleisch and Prentice (1973) to develop tests by
imitating the derivation of the locally most powerful (LMP)
rank test in the uncensored case. Although they imitate the
development of LMP rank tests for the uncensored case, it is
unclear whether these tests are LMP in the censored case.
Scores for the test are derived for two cases: (1) the
underlying distribution of the differences (i.e., $X'_{1i} - X'_{2i}$)
is logistic and (2) the underlying distribution of the
differences is double exponential. In each case the statistic
developed reduces to the usual statistic (the Wilcoxon signed
rank statistic for an underlying logistic density and the sign
test statistic for a double exponential density) when no
censoring is present. Asymptotic results for the tests are
derived based on the number of censored and uncensored
observations tending to infinity simultaneously.

Popovich (1983) proposed a class of statistics for the
problem of testing for differences in location using
censored matched pair data. The class consists of linear
combinations of two statistics which are independent given
$N_1$ and $N_2$, where $N_1$ is the number of pairs in which both
members are uncensored and $N_2$ is the number of pairs in which
exactly one member is censored. The class of statistics can
be expressed in the general form of

$$T_n(N_1, N_2) = (1 - L_n)^{1/2}\, T^*_{1n}(N_1) + L_n^{1/2}\, T^*_{2n}(N_2)$$

where $T^*_{1n}$ is the standardized Wilcoxon signed rank statistic
calculated on the $N_1$ uncensored pairs, and
$T^*_{2n} = N_2^{-1/2}(N_{2R} - N_{2L})$, where $N_{2R}$ is the number of pairs for
which $X_{1i}$ is censored and $X_{2i}$ is not, and $N_{2L}$ is the number
of pairs for which $X_{2i}$ is censored and $X_{1i}$ is not (note
$N_{2R} + N_{2L} = N_2$). The weight $L_n$ is a function of $N_1$ and $N_2$ only
such that $0 < L_n < 1$ and $L_n \xrightarrow{P} L$. Note that $T^*_{1n}$ is a
distribution-free statistic calculated only on the
uncensored pairs (and is a common statistic used for testing
for location in the uncensored case), while $T^*_{2n}$ is a
statistic based only on the type 2 and 3 pairs (as
previously defined in this introduction). The statistic
$T^*_{2n}$ is designed to detect whether type 2 pairs are occurring
more often (or less often) than they should under the null
hypothesis. Under $H_0$, $T^*_{2n}$ is a standardized binomial random
variable with parameters $N_2 = n_2$ and $p = 1/2$ and is thus
distribution-free. Popovich obtains asymptotic normality
for the statistic $T_n(N_1, N_2)$ under the conditions (1) that $N_1$
and $N_2$ tend to infinity simultaneously and (2) under a more
general condition as n tends to infinity. In a Monte Carlo
study, he compares five statistics from this class to the
test statistic of Woolson and Lachenbruch ($T_{WL}$) (1980) based
on logistic scores. The results show that these statistics
perform as well as $T_{WL}$ (better in some cases) and that they
are computationally much easier to calculate. Furthermore,
exact tables can be generated for any member of the class
proposed by Popovich.
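
A member of this class can be computed as in the sketch below. It is illustrative only: the function names are hypothetical, the weight is passed in as a plain number rather than derived from $N_1$ and $N_2$, the signed rank statistic is computed on the differences $X_{1i} - X_{2i}$ with no ties or zeros assumed, and returning 0 for an empty group is a convenience of this sketch rather than part of Popovich's definition.

```python
import math


def signed_rank_standardized(diffs):
    """Standardized Wilcoxon signed rank statistic (no ties, no zeros)."""
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    # Sum of the ranks of |d| that belong to positive differences.
    w_plus = sum(
        rank for rank, i in enumerate(order, start=1) if diffs[i] > 0
    )
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (w_plus - mean) / math.sqrt(var)


def popovich_statistic(x1, x2, delta, weight):
    """(1 - L)^(1/2) * T1 + L^(1/2) * T2 for a given weight L in [0, 1]."""
    diffs = [a - b for a, b, d in zip(x1, x2, delta) if d == 1]
    n2r = sum(1 for d in delta if d == 3)  # first member censored only
    n2l = sum(1 for d in delta if d == 2)  # second member censored only
    n2 = n2r + n2l
    t1 = signed_rank_standardized(diffs) if diffs else 0.0
    t2 = (n2r - n2l) / math.sqrt(n2) if n2 else 0.0
    return math.sqrt(1 - weight) * t1 + math.sqrt(weight) * t2
```

Type 4 pairs (both members censored) do not enter either component, which matches the description of the class as depending only on the uncensored and singly censored pairs.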

With  the  background  established  for  the  research  in 
this  thesis,  the  attention  will  now  be  focused  toward  the 
development  of  the  test  statistics  to  be  investigated 
here.   Chapter  Two  will  present  a  statistic  for  testing  for 
differences  in  scale  which  can  be  viewed  as  an  extension  of 
Kepner's $\tau_n$ for censored data. In Chapter Three, another
statistic  will  be  presented  for  the  same  alternative  but 
more  in  the  spirit  of  the  work  proposed  by  Popovich,  that 
is,  the  linear  combination  of  two  statistics  which  are 
conditionally  independent  (conditioned  on  the  number  of  type 
1  and  (type  2  +  type  3)  pairs  observed).   For  the  more 
general  alternative  (i.e.,  differences  in  location  and/or 
scale),  Chapter  Four  will  present  a  statistic(s)  which  is  a 
vector  of  two  statistics  (one  for  scale  and  one  for 
location)  following  the  work  of  Kepner.   Lastly,  Chapter 
Five  will  present  a  Monte  Carlo  study  of  the  statistics 
developed  in  this  dissertation. 


CHAPTER  TWO 
A  STATISTIC  FOR  TESTING  FOR  DIFFERENCES  IN  SCALE 

2.1   Introduction 


In  this  chapter  a  statistic  will  be  presented  for 
testing  the  null  hypothesis  of  bivariate  symmetry  in  the 
presence  of  random  right  censoring.   Figure  1  represents  a 
possible  contour  of  an  absolutely  continuous  distribution  of 
this  form.   The  alternative  hypothesis  for  which  this  test 
statistic is developed is $H_a$: $\sigma_1 \neq \sigma_2$; i.e., the marginal
distributions  differ  in  their  scale  parameters.   The 
marginal  distributions  are  assumed  to  have  the  same  location 
parameter.   Figure  2  represents  a  possible  contour  of  an 
absolutely  continuous  distribution  of  this  form. 

The  basic  idea  for  this  statistic  was  introduced  in  a 
dissertation  by  Kepner  (1979).   He  suggested  the  use  of 
Kendall's  tau  on  an  orthogonal  transformation  of  the 
original  random  variables  to  test  for  differences  in  scale 
in the marginal distributions. His work did not consider a
censoring random variable. To extend this idea to
include random right censoring, the concept
of concordance and discordance in the presence of censoring
used by Oakes (1982) is applied.

Figure 1.   Contour of an Absolutely Continuous Distribution
That Has Equal Marginal Locations and Equal
Marginal Scales.

Figure 2.   Contour of an Absolutely Continuous Distribution
That Has Equal Marginal Locations and Unequal
Marginal Scales.

Section 2.2 will present the test statistic and the
notation  necessary  for  its  presentation.   A  small  sample 
test  will  be  discussed  in  Section  2.3.   Section  2.4  will 
investigate  the  asymptotic  properties  of  the  test  statistic, 
with  comments  on  the  statistic  following  in  Section  2.5. 


2.2   The  CD  Statistic 


In  this  section,  the  test  statistic  will  be  presented 
which  is  designed  to  test  whether  the  marginal  distributions 
differ  in  their  scale  parameters.   First,  since  the  work  is 
so  related,  the  test  statistic  which  Kepner  (1979)  proposed 
to  test  for  unequal  marginal  scales  will  be  presented.   This 
will  give  the  reader  an  understanding  of  the  motivation  for 
the  test  statistic. 

Let $(X'_{1i}, X'_{2i})$ for $i = 1, 2, \ldots, n$ denote independent
identically distributed (i.i.d.) bivariate random variables
which are distributed as $(X'_{11}, X'_{21})$. Consider the following
orthogonal transformation of the random variables $(X'_{1i}, X'_{2i})$:
let

$$Y'_{1i} = X'_{1i} + X'_{2i} \quad \text{and} \quad Y'_{2i} = X'_{1i} - X'_{2i} \quad \text{for } i = 1, 2, \ldots, n.$$


Figure 3 illustrates what happens to the contour given in
Figure 1 (i.e., the contour of an absolutely continuous
distribution under $H_0$) when this transformation is
applied. Figure 4 shows what happens to the contour given
in Figure 2 (i.e., under $H_a$) when this transformation is
applied. Note, as can be seen in Figure 3, that under this
transformation and $H_0$, $Y'_{11}$ and $Y'_{21}$ are not correlated,
although $X'_{11}$ and $X'_{21}$ possibly were. Similarly, as can be
seen in Figure 4, under this transformation and $H_a$, $Y'_{11}$ and
$Y'_{21}$ are correlated (negatively in this case). Thus, the
original problem of testing for unequal marginal scales has
been transformed into the problem of testing for correlation
between $Y'_{11}$ and $Y'_{21}$. Kepner (1979) suggested the use of
Kendall's tau to test for correlation between $Y'_{1i}$ and $Y'_{2i}$.
Kendall's tau was chosen because it is a
U-statistic and, thus, the many established results for
U-statistics could be applied.
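
The lack of correlation under the null hypothesis can be verified directly. Assuming finite second moments (an assumption not stated above) and writing $Y'_1 = X'_1 + X'_2$ and $Y'_2 = X'_1 - X'_2$ for a generic pair,

$$\mathrm{Cov}(Y'_1, Y'_2) = \mathrm{Var}(X'_1) - \mathrm{Cov}(X'_1, X'_2) + \mathrm{Cov}(X'_2, X'_1) - \mathrm{Var}(X'_2) = \mathrm{Var}(X'_1) - \mathrm{Var}(X'_2).$$

Under bivariate symmetry the two variances agree, so the covariance is zero whatever the correlation between $X'_1$ and $X'_2$. When the marginal scales differ, the covariance is nonzero, and it is negative when $X'_2$ has the larger variance.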

The test statistic which will be presented in this
section is very similar to the above mentioned statistic.
However, when censoring is present, the true value
of $X'_{1i}$ or $X'_{2i}$ (or both) is not known, and thus $Y'_{1i}$ or $Y'_{2i}$
(or both) are also affected. To take this into account, a
modified Kendall's tau will be used which was presented by
Oakes (1982) to test for independence in the presence of
censoring. First though, some additional notation must be
introduced.

Figure 3.   Contour of an Absolutely Continuous Distribution
That Has Equal Marginal Locations and Equal
Marginal Scales under the Transformation.

Figure 4.   Contour of an Absolutely Continuous Distribution
That Has Equal Marginal Locations and Unequal
Marginal Scales under the Transformation.

Recall that $(X'_{1i}, X'_{2i})$ denotes bivariate random variables
which are distributed as $(X'_{11}, X'_{21})$. Let $C_1, C_2, \ldots, C_n$ denote
the censoring random variables, which are independent and
identically distributed (i.i.d.) with continuous
distribution function $G(c)$, where $C_i$ denotes the value of
the censoring variable associated with pair $(X'_{1i}, X'_{2i})$. In
the case of random right censoring, the observed sample
consists of $X_{1i} = \min(X'_{1i}, C_i)$ and $X_{2i} = \min(X'_{2i}, C_i)$. These
pairs can be classified into four pair types, which are

Pair Type    Description

1            $X'_{1i} < C_i$, $X'_{2i} < C_i$

2            $X'_{1i} < C_i$, $X'_{2i} > C_i$

3            $X'_{1i} > C_i$, $X'_{2i} < C_i$

4            $X'_{1i} > C_i$, $X'_{2i} > C_i$


Consider the following orthogonal transformation applied to
the observed sample:

$$Y_{1i} = X_{1i} + X_{2i} \quad \text{and} \quad Y_{2i} = X_{1i} - X_{2i} \quad \text{for } i = 1, 2, \ldots, n.$$

Notice that, due to censoring in type 2, 3, or 4 pairs, the
true values of $Y_{1i}$ and $Y_{2i}$ (denoted $Y'_{1i}$ and $Y'_{2i}$, i.e., the
values had no censoring occurred) are not actually
observed. The following table, Table 2.1, summarizes the
relationship of the true values $Y'_{1i}$ and $Y'_{2i}$ to the
observed values.
Table 2.1   Summarizing the Relationship Between the True
Values of $Y_{1i}$ and $Y_{2i}$ and the Observed Values

Pair                                      Relationship Between      Relationship Between
Type   Description                        $Y'_{1i}$ and $Y_{1i}$          $Y'_{2i}$ and $Y_{2i}$

1      $X'_{1i} < C_i$, $X'_{2i} < C_i$         $Y'_{1i} = Y_{1i}$            $Y'_{2i} = Y_{2i}$
2      $X'_{1i} < C_i$, $X'_{2i} > C_i$         $Y'_{1i} > Y_{1i}$            $Y'_{2i} < Y_{2i}$
3      $X'_{1i} > C_i$, $X'_{2i} < C_i$         $Y'_{1i} > Y_{1i}$            $Y'_{2i} > Y_{2i}$
4      $X'_{1i} > C_i$, $X'_{2i} > C_i$         $Y'_{1i} > Y_{1i}$            uncertain (i.e., $Y'_{2i} > Y_{2i}$
                                                                    or $Y'_{2i} < Y_{2i}$)


The modified Kendall's tau (denoted CD for
concordant-discordant) can now be defined as

$$CD = \binom{n}{2}^{-1} \sum_{i<j} a_{ij} b_{ij} \qquad (2.2.1)$$

where for $i < j$

$$a_{ij} = \begin{cases} 1 & \text{if } Y'_{1i} \prec Y'_{1j} \\ -1 & \text{if } Y'_{1i} \succ Y'_{1j} \\ 0 & \text{if uncertain of the relationship} \end{cases}
\quad \text{and} \quad
b_{ij} = \begin{cases} 1 & \text{if } Y'_{2i} \prec Y'_{2j} \\ -1 & \text{if } Y'_{2i} \succ Y'_{2j} \\ 0 & \text{if uncertain of the relationship} \end{cases}$$

(here $Y'_{1i} \prec Y'_{1j}$ can be read as "$Y'_{1i}$ is definitely smaller
than $Y'_{1j}$").


For  example,  if  the  i    pair  is  a  type  1  and  the  j 


th 


pair  is  a  type  2  and  it  was  observed  that  Y,.  <  Y,  .,  then 
a±.     =  1  since  Yu  =  Y^  and  Y^  <  yJj  (thus  y[±    <    y[  ,  )  .   If 

Y,.  >  Y,  •  had  been  observed,  then  a.  .  =  0,  since  the 

t         t 
relationship  between  Y, .  and  Y, .  is  uncertain.   Similarly, 

if  Y2i  <  Y2, ,  then  b±.    =  0,  since  Y2±  =  Y2i  and  Y^,     <    Y2 • 

I  T 

(thus,  the  relationship  between  Y2j  and  Y2 ^  is  uncertain). 
On  the  other  hand,  if  Y2-  >  Y2-  had  been  observed,  then 
b.  •  =  -1  (by  a  similar  argument). 

Table 2.2 summarizes the necessary conditions for $a_{ij}$ and $b_{ij}$ to take on the values of -1, 1 or 0. The product of $a_{ij}$ and $b_{ij}$ results in a value of 1 if the $i$th and $j$th pairs of the transformed data points are definitely concordant, a value of -1 if the pairs are definitely discordant and 0 if it is uncertain. If the $i$th pair is a type 4 (i.e., both $X'_{1i}$ and $X'_{2i}$ were censored), then $b_{ij}$ will always be 0 since the relationship between the $i$th and $j$th pair is always uncertain regardless of the $j$th pair's type. Thus, type 4 pairs always contribute 0's in the sum for CD. Notice also that in the case of no censoring this modified Kendall's tau reduces to Kendall's tau applied to the transformed data, the statistic investigated by Kepner (1979).

Table 2.2   Summarizing the Values of $a_{ij}$ and $b_{ij}$ for $i<j$

$a_{ij} = 1$:  if $Y_{1i} < Y_{1j}$ and one of the following occurs,

    $i$th pair type    $j$th pair type
          1                  1
          1                  2
          1                  3
          1                  4

$a_{ij} = -1$:  if $Y_{1i} > Y_{1j}$ and one of the following occurs,

    $i$th pair type    $j$th pair type
          1                  1
          2                  1
          3                  1
          4                  1

$a_{ij} = 0$:  for all other cases

$b_{ij} = 1$:  if $Y_{2i} < Y_{2j}$ and one of the following occurs,

    $i$th pair type    $j$th pair type
          1                  1
          1                  3
          2                  1
          2                  3

$b_{ij} = -1$:  if $Y_{2i} > Y_{2j}$ and one of the following occurs,

    $i$th pair type    $j$th pair type
          1                  1
          1                  2
          3                  1
          3                  2

$b_{ij} = 0$:  for all other cases
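
The rules for $a_{ij}$ and $b_{ij}$ and the resulting CD statistic can be sketched in Python as follows. This is a minimal illustration, not the original CDSTAT subroutine: the function names are hypothetical, each observation is assumed to be stored as a tuple `(y1, y2, pair_type)`, and no ties among the observed values are assumed.

```python
from itertools import combinations


def a_score(y1_i, type_i, y1_j, type_j):
    """a_ij: +1 or -1 when the order of the true sums is certain, else 0."""
    if y1_i < y1_j and type_i == 1:      # Y1i is exact, Y1j can only be larger
        return 1
    if y1_i > y1_j and type_j == 1:      # Y1j is exact, Y1i can only be larger
        return -1
    return 0


def b_score(y2_i, type_i, y2_j, type_j):
    """b_ij: +1 or -1 when the order of the true differences is certain."""
    if y2_i < y2_j and type_i in (1, 2) and type_j in (1, 3):
        return 1
    if y2_i > y2_j and type_i in (1, 3) and type_j in (1, 2):
        return -1
    return 0


def cd_statistic(sample):
    """CD = sum over i<j of a_ij * b_ij, divided by n choose 2."""
    n = len(sample)
    total = 0
    for (y1_i, y2_i, t_i), (y1_j, y2_j, t_j) in combinations(sample, 2):
        total += a_score(y1_i, t_i, y1_j, t_j) * b_score(y2_i, t_i, y2_j, t_j)
    return total / (n * (n - 1) / 2)


# Two uncensored pairs and one type 4 pair: only the uncensored pair of
# observations contributes a non-zero product.
example = [(1.0, 1.0, 1), (2.0, 2.0, 1), (3.0, 0.0, 4)]
cd = cd_statistic(example)
```

With no censoring (all pair types equal to 1) the function reduces to the ordinary Kendall's tau computed on the transformed data.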
Next, we establish some properties of the CD statistic.

Lemma 2.2.1:   Under $H_0$,

$$E(CD) = 0$$

and

$$\mathrm{Var}(CD) = \frac{2}{n(n-1)}\,\alpha + \frac{4(n-2)}{n(n-1)}\,\gamma$$

where

$$\alpha = 4P(a_{ij} = 1, b_{ij} = 1)$$

and

$$\begin{aligned}
\gamma = {} & 2P(a_{ij} = 1, b_{ij} = 1, a_{ij'} = 1, b_{ij'} = 1) \\
 & + 2P(a_{ij} = -1, b_{ij} = 1, a_{ij'} = -1, b_{ij'} = 1) \\
 & + 4P(a_{ij} = 1, b_{ij} = -1, a_{ij'} = -1, b_{ij'} = 1) \\
 & - 2P(a_{ij} = 1, b_{ij} = -1, a_{ij'} = 1, b_{ij'} = 1) \\
 & - 2P(a_{ij} = -1, b_{ij} = -1, a_{ij'} = -1, b_{ij'} = 1) \\
 & - 4P(a_{ij} = 1, b_{ij} = 1, a_{ij'} = -1, b_{ij'} = 1).
\end{aligned} \tag{2.2.2}$$


Proof:

Throughout this proof, Theorem 1.3.7 in Randles and Wolfe (1979) will be used extensively and thus its use will not be explicitly indicated.

Under $H_0$,

$$(X'_{1i}, X'_{2i}, X'_{1j}, X'_{2j}, C_i, C_j) \overset{d}{=} (X'_{2i}, X'_{1i}, X'_{2j}, X'_{1j}, C_i, C_j)$$

and therefore it follows that

$$(X_{1i}, X_{2i}, X_{1j}, X_{2j}, \delta_i, \delta_j) \overset{d}{=} (X_{2i}, X_{1i}, X_{2j}, X_{1j}, f(\delta_i), f(\delta_j)) \tag{2.2.3}$$

where

$X_{1i} = \min(X'_{1i}, C_i)$,

$X_{2i} = \min(X'_{2i}, C_i)$,

$\delta_i$ indicates what type of pair $(X_{1i}, X_{2i})$ is,

and

$f(\delta_i)$ indicates what type of pair $(X_{2i}, X_{1i})$ is.

Thus, $f(\cdot)$ is the function defined below.

$\delta_i$    $f(\delta_i)$
   1             1
   2             3
   3             2
   4             4

Let $Y_{1i} = X_{1i} + X_{2i}$ and $Y_{2i} = X_{1i} - X_{2i}$; thus from (2.2.3)

$$(Y_{1i}, Y_{2i}, Y_{1j}, Y_{2j}, \delta_i, \delta_j) \overset{d}{=} (Y_{1i}, -Y_{2i}, Y_{1j}, -Y_{2j}, f(\delta_i), f(\delta_j)).$$

Applying the definition of $a_{ij}$ and $b_{ij}$ in (2.2.1) (or using Table 2.2) to the above, it follows that

$$(a_{ij}, b_{ij}) \overset{d}{=} (a_{ij}, -b_{ij})$$

and thus

$$P(a_{ij} = 1, b_{ij} = 1) = P(a_{ij} = 1, b_{ij} = -1)$$

and

$$P(a_{ij} = -1, b_{ij} = 1) = P(a_{ij} = -1, b_{ij} = -1). \tag{2.2.4}$$

Now,

$$E(CD) = \frac{1}{\binom{n}{2}} \sum_{i<j} E(a_{ij} b_{ij})$$

where

$$\begin{aligned}
E(a_{ij} b_{ij}) &= (1)P(a_{ij} b_{ij} = 1) + (-1)P(a_{ij} b_{ij} = -1) \\
&= P(a_{ij} = 1, b_{ij} = 1) + P(a_{ij} = -1, b_{ij} = -1) \\
&\quad - P(a_{ij} = 1, b_{ij} = -1) - P(a_{ij} = -1, b_{ij} = 1).
\end{aligned}$$

Applying (2.2.4) to the above, it follows that $E(a_{ij} b_{ij}) = 0$ and thus $E(CD) = 0$.

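Note that under $H_0$,

$$(X_{1i}, X_{2i}, X_{1j}, X_{2j}, \delta_i, \delta_j) \overset{d}{=} (X_{1j}, X_{2j}, X_{1i}, X_{2i}, \delta_j, \delta_i)$$

and thus

$$(Y_{1i}, Y_{2i}, Y_{1j}, Y_{2j}, \delta_i, \delta_j) \overset{d}{=} (Y_{1j}, Y_{2j}, Y_{1i}, Y_{2i}, \delta_j, \delta_i). \tag{2.2.5}$$

Applying the definition of $a_{ij}$ and $b_{ij}$ as before, it follows that

$$(a_{ij}, b_{ij}) \overset{d}{=} (-a_{ij}, -b_{ij})$$

and also

$$P(a_{ij} = 1, b_{ij} = 1) = P(a_{ij} = -1, b_{ij} = -1)$$

and

$$P(a_{ij} = 1, b_{ij} = -1) = P(a_{ij} = -1, b_{ij} = 1).$$

Now,

$$\mathrm{Var}(CD) = \left[\frac{1}{\binom{n}{2}}\right]^2 \mathrm{Var}\Big(\sum_{i<j} a_{ij} b_{ij}\Big)
= \left[\frac{1}{\binom{n}{2}}\right]^2 \sum_{i<j} \sum_{i'<j'} \mathrm{Cov}(a_{ij} b_{ij},\, a_{i'j'} b_{i'j'}).$$

The three possible cases to consider for the covariance are

1) $i \neq i'$, $j \neq j'$, $i<j$, $i'<j'$ (no subscripts in common),

2) $i = i'$, $j = j'$, $i<j$, $i'<j'$,

and

3) where exactly two of the four subscripts $i<j$ and $i'<j'$ are the same.

Case 1)   $i \neq i'$, $j \neq j'$:

In this case, $\mathrm{Cov}(a_{ij} b_{ij},\, a_{i'j'} b_{i'j'}) = 0$ since the bivariate pairs are i.i.d.

Case 2)   $i = i'$, $j = j'$:

In this case, since $E(a_{ij} b_{ij}) = 0$,

$$\begin{aligned}
\mathrm{Cov}(a_{ij} b_{ij},\, a_{ij} b_{ij}) &= E[(a_{ij} b_{ij})^2] \\
&= P(a_{ij} = 1, b_{ij} = 1) + P(a_{ij} = 1, b_{ij} = -1) \\
&\quad + P(a_{ij} = -1, b_{ij} = 1) + P(a_{ij} = -1, b_{ij} = -1) \\
&= 4P(a_{ij} = 1, b_{ij} = 1) = \alpha
\end{aligned}$$

(by (2.2.4) and the equalities established above).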
Case 3)   Exactly two of the four subscripts $i<j$, $i'<j'$ are the same.

Now,

$$\mathrm{Cov}(a_{ij} b_{ij},\, a_{ik} b_{ik}) = E(a_{ij} b_{ij} a_{ik} b_{ik}).$$

Define the following events:

$A_{1,1}$:   $\{a_{ij} = 1,\ a_{ik} = 1\}$

$A_{1,-1}$:   $\{a_{ij} = 1,\ a_{ik} = -1\}$

$A_{-1,1}$:   $\{a_{ij} = -1,\ a_{ik} = 1\}$

$A_{-1,-1}$:   $\{a_{ij} = -1,\ a_{ik} = -1\}$

and similarly define the events $B_{1,1}$, $B_{1,-1}$, $B_{-1,1}$ and $B_{-1,-1}$. Using this notation, $E(a_{ij} b_{ij} a_{ik} b_{ik})$ can be written

$$E(a_{ij} b_{ij} a_{ik} b_{ik}) = \sum_{k=0}^{1} \sum_{\ell=0}^{1} \sum_{m=0}^{1} \sum_{n=0}^{1} (-1)^{k+\ell+m+n}\, P\big(A_{(-1)^k,(-1)^\ell},\, B_{(-1)^m,(-1)^n}\big). \tag{2.2.6}$$

Table 2.3 describes the events $A_{(-1)^k,(-1)^\ell}$ and $B_{(-1)^m,(-1)^n}$ in more detail and the restrictions placed on the $\delta$'s.

Now, to simplify the probabilities in (2.2.6), note that under $H_0$

$$(X_{1i}, X_{2i}, X_{1j}, X_{2j}, X_{1k}, X_{2k}, \delta_i, \delta_j, \delta_k) \overset{d}{=} (X_{2i}, X_{1i}, X_{2j}, X_{1j}, X_{2k}, X_{1k}, f(\delta_i), f(\delta_j), f(\delta_k)).$$

Applying the transformations $Y_{1i} = X_{1i} + X_{2i}$ and $Y_{2i} = X_{1i} - X_{2i}$, it follows that

$$(Y_{1i}, Y_{2i}, Y_{1j}, Y_{2j}, Y_{1k}, Y_{2k}, \delta_i, \delta_j, \delta_k) \overset{d}{=} (Y_{1i}, -Y_{2i}, Y_{1j}, -Y_{2j}, Y_{1k}, -Y_{2k}, f(\delta_i), f(\delta_j), f(\delta_k)).$$

Now, applying the definitions of $a_{ij}$ and $b_{ij}$ in (2.2.1) (or using Table 2.2), notice that if $b_{ij} = 1$ (i.e., $Y_{2i} < Y_{2j}$, $\delta_i \in \{1,2\}$ and $\delta_j \in \{1,3\}$), then $-Y_{2i} > -Y_{2j}$, $f(\delta_i) \in \{1,3\}$ and $f(\delta_j) \in \{1,2\}$, which would yield $b_{ij} = -1$. Using similar arguments, it follows

$$(a_{ij}, a_{ik}, b_{ij}, b_{ik}) \overset{d}{=} (a_{ij}, a_{ik}, -b_{ij}, -b_{ik})$$

and thus

$$\begin{aligned}
P(A_{1,1}, B_{1,1}) &= P(A_{1,1}, B_{-1,-1}) \\
P(A_{1,1}, B_{-1,1}) &= P(A_{1,1}, B_{1,-1}) \\
P(A_{-1,-1}, B_{1,1}) &= P(A_{-1,-1}, B_{-1,-1}) \\
P(A_{-1,-1}, B_{-1,1}) &= P(A_{-1,-1}, B_{1,-1}) \\
P(A_{-1,1}, B_{1,1}) &= P(A_{-1,1}, B_{-1,-1}) \\
P(A_{-1,1}, B_{-1,1}) &= P(A_{-1,1}, B_{1,-1}) \\
P(A_{1,-1}, B_{1,1}) &= P(A_{1,-1}, B_{-1,-1}) \\
P(A_{1,-1}, B_{-1,1}) &= P(A_{1,-1}, B_{1,-1}).
\end{aligned}$$

[Table 2.3, describing the events $A_{(-1)^k,(-1)^\ell}$ and $B_{(-1)^m,(-1)^n}$ and the restrictions placed on the $\delta$'s, is not recoverable from this copy.]

Similarly, under $H_0$

$$(X_{1i}, X_{2i}, X_{1j}, X_{2j}, X_{1k}, X_{2k}, \delta_i, \delta_j, \delta_k) \overset{d}{=} (X_{1i}, X_{2i}, X_{1k}, X_{2k}, X_{1j}, X_{2j}, \delta_i, \delta_k, \delta_j)$$

and applying the definition of $a_{ij}$ and $b_{ij}$ in (2.2.1) it follows that

$$(a_{ij}, a_{ik}, b_{ij}, b_{ik}) \overset{d}{=} (a_{ik}, a_{ij}, b_{ik}, b_{ij}).$$

This yields that

$$\begin{aligned}
P(A_{-1,1}, B_{-1,1}) &= P(A_{1,-1}, B_{1,-1}) \\
P(A_{-1,1}, B_{1,-1}) &= P(A_{1,-1}, B_{-1,1}) \\
P(A_{1,-1}, B_{1,1}) &= P(A_{-1,1}, B_{1,1}) \\
P(A_{1,-1}, B_{-1,-1}) &= P(A_{-1,1}, B_{-1,-1}).
\end{aligned}$$

Thus, $E(a_{ij} b_{ij} a_{ik} b_{ik})$ can be reduced to a sum of six terms instead of the original sixteen; i.e.,

$$\begin{aligned}
E(a_{ij} b_{ij} a_{ik} b_{ik}) &= \sum_{k=0}^{1} \sum_{\ell=0}^{1} \sum_{m=0}^{1} \sum_{n=0}^{1} (-1)^{k+\ell+m+n}\, P\big(A_{(-1)^k,(-1)^\ell},\, B_{(-1)^m,(-1)^n}\big) \\
&= 2P(A_{1,1}, B_{1,1}) + 2P(A_{-1,-1}, B_{1,1}) + 4P(A_{1,-1}, B_{-1,1}) \\
&\quad - 2P(A_{1,1}, B_{-1,1}) - 2P(A_{-1,-1}, B_{-1,1}) - 4P(A_{1,-1}, B_{1,1}) = \gamma.
\end{aligned}$$

Note, the subscripts are arbitrary; thus

$$E(a_{ij} b_{ij} a_{ik} b_{ik}) = E(a_{ij} b_{ij} a_{kj} b_{kj}) = E(a_{ij} b_{ij} a_{jk} b_{jk})$$

and therefore, combining the results from cases 1, 2 and 3, it follows that

$$\mathrm{Var}(CD) = \binom{n}{2}^{-1} \left\{ \alpha + 2\binom{n-2}{2-1}\gamma \right\} = \frac{2}{n(n-1)}\,\alpha + \frac{4(n-2)}{n(n-1)}\,\gamma.$$
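
The counting behind this variance formula can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the values of $\alpha$ and $\gamma$ passed in are arbitrary numbers rather than quantities from the study. It compares the closed form with a brute-force sum over all pairs of index pairs, adding $\alpha$ when both subscripts agree (case 2) and $\gamma$ when exactly one subscript is shared (case 3).

```python
from itertools import combinations


def var_cd(alpha, gamma, n):
    """Closed-form Var(CD) from Lemma 2.2.1."""
    return 2 * alpha / (n * (n - 1)) + 4 * (n - 2) * gamma / (n * (n - 1))


def var_cd_by_counting(alpha, gamma, n):
    """Sum the covariances over every pair of index pairs (i<j, i'<j')."""
    pairs = list(combinations(range(n), 2))
    total = 0.0
    for p in pairs:
        for q in pairs:
            shared = len(set(p) & set(q))
            if shared == 2:        # case 2: identical index pairs
                total += alpha
            elif shared == 1:      # case 3: exactly one index in common
                total += gamma
            # case 1: no index in common contributes zero
    return total / len(pairs) ** 2


closed_form = var_cd(0.8, 0.1, 6)
counted = var_cd_by_counting(0.8, 0.1, 6)
```

Both computations agree because each of the $\binom{n}{2}$ index pairs shares exactly one subscript with $2(n-2)$ other index pairs.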


As seen in Lemma 2.2.1, the variance of CD depends on the underlying distribution of $(X'_{1i}, X'_{2i})$ and possibly $C_i$. Therefore, CD is not distribution-free under $H_0$. Section 2.3 will discuss a permutation test based on CD that is conditionally distribution-free. This test is recommended for small samples. For larger samples, Section 2.4 presents the asymptotic normal distribution of CD using a consistent estimator of the variance. This result can be used to construct a distribution-free large-sample test based on CD.


2.3   Permutation  Test 


In  the  situation  where  the  sample  size  is  small,  a 
permutation  test  based  on  CD  is  recommended.   What  is 
considered  a  small  sample  size  will  be  discussed  in  Chapter 
Five  when  the  Monte  Carlo  results  are  presented.   Now,  we 
will  develop  the  motivation  for  the  permutation  test. 

Recall, under $H_0$

$$(X'_{1i}, X'_{2i}, C_i) \overset{d}{=} (X'_{2i}, X'_{1i}, C_i)$$

and thus

$$(X_{1i}, X_{2i}, \delta_i) \overset{d}{=} (X_{2i}, X_{1i}, f(\delta_i)) \tag{2.3.1}$$

where $X_{1i} = \min(X'_{1i}, C_i)$, $X_{2i} = \min(X'_{2i}, C_i)$, $\delta_i$ is the pair type (i.e., $\delta_i = 1, 2, 3$ or $4$) and $f(\delta_i)$ is a function such that

$\delta_i$    $f(\delta_i)$
   1             1
   2             3
   3             2
   4             4

Let $k$ ($k = 0$ or $1$) be an operator such that

$$(X_{1i}, X_{2i}, \delta_i)^k = \begin{cases} (X_{1i}, X_{2i}, \delta_i) & \text{if } k = 1 \\ (X_{2i}, X_{1i}, f(\delta_i)) & \text{if } k = 0 \end{cases}$$

and $K = \{k: k \text{ is a } 1 \times n \text{ vector of 0's and 1's}\}$ (of which there are $2^n$ different elements). Thus, applying this operator to (2.3.1), we see under $H_0$, $P\{(X_{1i}, X_{2i}, \delta_i) = (x_{1i}, x_{2i}, \delta_i)^0\} = P\{(X_{1i}, X_{2i}, \delta_i) = (x_{1i}, x_{2i}, \delta_i)^1\}$. Applying this idea to the entire sample (in which the observations are i.i.d.), under $H_0$, it follows that

$$\{(X_{11}, X_{21}, \delta_1)^{k_1}, (X_{12}, X_{22}, \delta_2)^{k_2}, \ldots, (X_{1n}, X_{2n}, \delta_n)^{k_n}\} \overset{d}{=} \{(X_{11}, X_{21}, \delta_1)^{k'_1}, (X_{12}, X_{22}, \delta_2)^{k'_2}, \ldots, (X_{1n}, X_{2n}, \delta_n)^{k'_n}\} \tag{2.3.2}$$

where $k$ and $k'$ are arbitrary elements of $K$. Therefore, under $H_0$, given

$$\{(x_{11}, x_{21}, \delta_1), (x_{12}, x_{22}, \delta_2), \ldots, (x_{1n}, x_{2n}, \delta_n)\},$$

the $2^n$ possible vectors

$$\{(x_{11}, x_{21}, \delta_1)^{k_1}, (x_{12}, x_{22}, \delta_2)^{k_2}, \ldots, (x_{1n}, x_{2n}, \delta_n)^{k_n}\}$$

are equally likely values for

$$\{(X_{11}, X_{21}, \delta_1), (X_{12}, X_{22}, \delta_2), \ldots, (X_{1n}, X_{2n}, \delta_n)\}.$$

The idea of the permutation test is to compare the observed value of CD for the sample witnessed to the conditional distribution of CD derived from the $2^n$ equally likely possible values of CD (not necessarily unique) calculated from

$$\{(x_{11}, x_{21}, \delta_1)^{k_1}, (x_{12}, x_{22}, \delta_2)^{k_2}, \ldots, (x_{1n}, x_{2n}, \delta_n)^{k_n}\}.$$

Note, since the sample observed is censored, the $2^n$ vectors $\{(x_{11}, x_{21}, \delta_1)^{k_1}, (x_{12}, x_{22}, \delta_2)^{k_2}, \ldots, (x_{1n}, x_{2n}, \delta_n)^{k_n}\}$ are not necessarily unique. If a pair is a type 4 (i.e., both $X'_{1i}$ and $X'_{2i}$ were censored), then $(x_{1i}, x_{2i}, \delta_i)^1 = (x_{1i}, x_{2i}, \delta_i)^0$. In fact, there are only $2^{n-n_4}$ unique vectors ($n_4$ = number of type 4 pairs), since $P(X_{1i} = X_{2i}) = 0$ if $(X_{1i}, X_{2i})$ is not a type 4 pair under assumption A2. As a result, the permutation test, in effect, discards the type 4 pairs (since $a_{ij} b_{ij} = 0$ if the $i$th or $j$th pair is a type 4) and treats the sample as if it were of size $n - n_4$ with no type 4 pairs occurring.

With regard to the transformed variables $(Y_{1i}, Y_{2i})$, $i = 1, 2, \ldots, n$, the permutation test can be viewed in the following way. Consider the transformations $Y_{1i} = X_{1i} + X_{2i}$ and $Y_{2i} = X_{1i} - X_{2i}$. Applying these to (2.3.1) and (2.3.2), we see that under $H_0$

$$(Y_{1i}, Y_{2i}, \delta_i) \overset{d}{=} (Y_{1i}, -Y_{2i}, f(\delta_i))$$

and similarly,

$$\{(Y_{11}, Y_{21}, \delta_1)^{k_1}, (Y_{12}, Y_{22}, \delta_2)^{k_2}, \ldots, (Y_{1n}, Y_{2n}, \delta_n)^{k_n}\} \overset{d}{=} \{(Y_{11}, Y_{21}, \delta_1)^{k'_1}, (Y_{12}, Y_{22}, \delta_2)^{k'_2}, \ldots, (Y_{1n}, Y_{2n}, \delta_n)^{k'_n}\}$$

where

$$(Y_{1i}, Y_{2i}, \delta_i)^k = \begin{cases} (Y_{1i}, Y_{2i}, \delta_i) & \text{if } k = 1 \\ (Y_{1i}, -Y_{2i}, f(\delta_i)) & \text{if } k = 0 \end{cases}$$

and $k$ and $k'$ are arbitrary elements of $K$. That is, under $H_0$, given $\{(y_{11}, y_{21}, \delta_1), (y_{12}, y_{22}, \delta_2), \ldots, (y_{1n}, y_{2n}, \delta_n)\}$, the $2^n$ possible vectors $\{(y_{11}, y_{21}, \delta_1)^{k_1}, (y_{12}, y_{22}, \delta_2)^{k_2}, \ldots, (y_{1n}, y_{2n}, \delta_n)^{k_n}\}$ are equally likely values for $\{(Y_{11}, Y_{21}, \delta_1), (Y_{12}, Y_{22}, \delta_2), \ldots, (Y_{1n}, Y_{2n}, \delta_n)\}$.

To perform the permutation test, the measurements $(x_{1i}, x_{2i}, \delta_i)$, $i = 1, 2, \ldots, n$, are observed and the corresponding value of CD is calculated. Under $H_0$, there are $2^n$ equally likely transformed vectors for $\{(Y_{11}, Y_{21}, \delta_1), (Y_{12}, Y_{22}, \delta_2), \ldots, (Y_{1n}, Y_{2n}, \delta_n)\}$. The CD statistic is computed for each of these possible vectors and from this the relative frequency of each possible CD value is determined. The null hypothesis is rejected if the original observed CD value is too large or too small when compared to the appropriate critical value of this conditional distribution.
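
The procedure just described can be sketched in Python by enumerating all $2^n$ sign patterns. This is a minimal illustration under stated assumptions, not the program used in the study: the function names are hypothetical, each observation is a tuple `(y1, y2, pair_type)`, ties are assumed absent, and the two-sided p-value based on $|CD|$ is a choice made here (it uses the fact that flipping every pair maps CD to -CD, so the conditional distribution is symmetric about zero).

```python
from itertools import combinations, product

F_SWAP = {1: 1, 2: 3, 3: 2, 4: 4}   # the function f applied to the pair type


def cd_statistic(sample):
    """CD for a list of (y1, y2, pair_type) tuples, following Table 2.2."""
    n = len(sample)
    total = 0
    for (y1_i, y2_i, t_i), (y1_j, y2_j, t_j) in combinations(sample, 2):
        if y1_i < y1_j and t_i == 1:
            a = 1
        elif y1_i > y1_j and t_j == 1:
            a = -1
        else:
            a = 0
        if y2_i < y2_j and t_i in (1, 2) and t_j in (1, 3):
            b = 1
        elif y2_i > y2_j and t_i in (1, 3) and t_j in (1, 2):
            b = -1
        else:
            b = 0
        total += a * b
    return total / (n * (n - 1) / 2)


def permutation_test(sample):
    """Return the observed CD and a two-sided permutation p-value."""
    observed = cd_statistic(sample)
    extreme = 0
    total = 0
    for ks in product((0, 1), repeat=len(sample)):
        # k = 1 keeps the pair; k = 0 replaces it by (y1, -y2, f(type)).
        flipped = [(y1, y2, t) if k == 1 else (y1, -y2, F_SWAP[t])
                   for (y1, y2, t), k in zip(sample, ks)]
        total += 1
        if abs(cd_statistic(flipped)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / total


observed, p_value = permutation_test([(1.0, 1.0, 1), (2.0, 2.0, 1), (3.0, 3.0, 1)])
```

Full enumeration is only practical for small $n$, since the number of sign patterns doubles with each additional pair.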


2.4   Asymptotic  Results 

In Section 2.3, a permutation test was presented to test $H_0$ when the sample size is small. For larger sample sizes, the permutation test becomes impractical and time consuming. In these situations, the asymptotic results presented in this section can be employed.

Theorem 2.4.1:   Under $H_0$,

$$\frac{CD}{[\mathrm{Var}(CD)]^{1/2}} \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty,$$

where

$$\mathrm{Var}(CD) = \frac{2}{n(n-1)}\,\alpha + \frac{4(n-2)}{n(n-1)}\,\gamma.$$

Proof:

Note that CD is a U-statistic with symmetric kernel $h(X_i, X_j) = a_{ij} b_{ij}$. Thus, by applying Theorem 3.3.13 of Randles and Wolfe (1979), it follows that

$$\sqrt{n}\, CD \xrightarrow{d} N(0, 4\zeta_1) \quad \text{as } n \to \infty,$$

where

$$\zeta_1 = E[h(X_i, X_j)\, h(X_i, X_k)] = E[a_{ij} b_{ij} a_{ik} b_{ik}] = \gamma.$$

Note that

$$\frac{4\gamma / n}{\dfrac{2}{n(n-1)}\,\alpha + \dfrac{4(n-2)}{n(n-1)}\,\gamma} \to 1 \quad \text{as } n \to \infty;$$

therefore, after applying Slutsky's Theorem (Theorem 3.2.8, Randles and Wolfe, 1979),

$$\frac{CD}{[\mathrm{Var}(CD)]^{1/2}} \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty. \qquad \square$$

Corollary 2.4.2:   If $\widehat{\mathrm{Var}}(CD)$ is any consistent estimator of $\mathrm{Var}(CD)$, then

$$\frac{CD}{[\widehat{\mathrm{Var}}(CD)]^{1/2}} \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty.$$

Proof:

This follows directly from Theorem 2.4.1 and Slutsky's Theorem. $\square$


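Next, we consider the problem of finding a consistent estimator for $\mathrm{Var}(CD)$. There are many consistent estimators for a variance, but three which worked well in the Monte Carlo study are described in the following lemma.

Lemma 2.4.3:   Under $H_0$, the following are consistent estimators of $\mathrm{Var}(CD)$:

1) $$\widehat{\mathrm{Var}}_1(CD) = \frac{4}{n} \left\{ \frac{1}{3\binom{n}{3}} \sum_{1 \le i<j<k \le n} A_{ijk} B_{ijk} \right\},$$

where

$$A_{ijk} B_{ijk} = (a_{ij} b_{ij} a_{ik} b_{ik} + a_{ik} b_{ik} a_{jk} b_{jk} + a_{ij} b_{ij} a_{jk} b_{jk}),$$

2) $$\widehat{\mathrm{Var}}_2(CD) = \frac{4}{n} \left\{ \frac{2}{n(n-1)^2} \Big[ \sum_{1 \le i<j<k \le n} A_{ijk} B_{ijk} + \sum_{1 \le i<j \le n} (a_{ij} b_{ij})^2 \Big] - (CD)^2 \right\},$$

and

3) $$\widehat{\mathrm{Var}}_3(CD) = \frac{2}{n(n-1)} \left\{ \Big[ \frac{1}{\binom{n}{2}} \sum_{1 \le i<j \le n} (a_{ij} b_{ij})^2 \Big] - (CD)^2 \right\} + \frac{4(n-2)}{n(n-1)} \left\{ \frac{n}{4}\, \widehat{\mathrm{Var}}_2(CD) \right\}.$$

Proof:

First, it will be shown that $n \widehat{\mathrm{Var}}_1(CD) \xrightarrow{p} 4\gamma$. Now,

$$n \widehat{\mathrm{Var}}_1(CD) = 4 \left\{ \frac{1}{3\binom{n}{3}} \sum_{1 \le i<j<k \le n} A_{ijk} B_{ijk} \right\} = 4 \left\{ \frac{1}{\binom{n}{3}} \sum_{1 \le i<j<k \le n} \frac{A_{ijk} B_{ijk}}{3} \right\},$$

which shows that $n \widehat{\mathrm{Var}}_1(CD) = 4U^*_n$, where $U^*_n$ is a U-statistic of degree 3 with symmetric kernel $h^* = A_{ijk} B_{ijk} / 3$. Thus, it follows that $n \widehat{\mathrm{Var}}_1(CD) \xrightarrow{p} 4\gamma$ since $U^*_n \xrightarrow{p} \gamma$ by Hoeffding's Theorem (Hoeffding, 1961).

Next, it will be shown that

$$n\big(\widehat{\mathrm{Var}}_2(CD) - \widehat{\mathrm{Var}}_1(CD)\big) \xrightarrow{p} 0 \quad \text{as } n \to \infty.$$

First though, notice that $\widehat{\mathrm{Var}}_2(CD)$ is equivalent to

$$\frac{(n-2)}{(n-1)}\, \widehat{\mathrm{Var}}_1(CD) + \frac{4}{n} \left\{ \Big[ \frac{2}{n(n-1)^2} \sum_{1 \le i<j \le n} (a_{ij} b_{ij})^2 \Big] - (CD)^2 \right\}.$$

Thus,

$$\begin{aligned}
n\big(\widehat{\mathrm{Var}}_2(CD) - \widehat{\mathrm{Var}}_1(CD)\big) &= n \left\{ \frac{(n-2)}{(n-1)} - 1 \right\} \widehat{\mathrm{Var}}_1(CD) + 4 \left\{ \Big[ \frac{2}{n(n-1)^2} \sum_{1 \le i<j \le n} (a_{ij} b_{ij})^2 \Big] - (CD)^2 \right\} \\
&= \frac{-4}{(n-1)}\, U^*_n + 4 \left\{ \Big[ \frac{1}{(n-1)\binom{n}{2}} \sum_{1 \le i<j \le n} (a_{ij} b_{ij})^2 \Big] - (CD)^2 \right\} \\
&\xrightarrow{p} 0 \quad \text{as } n \to \infty.
\end{aligned}$$

Therefore, $\widehat{\mathrm{Var}}_2(CD)$ is a consistent estimator for $\mathrm{Var}(CD)$.

Lastly, it will be shown that

$$n\big(\widehat{\mathrm{Var}}_3(CD) - \widehat{\mathrm{Var}}_2(CD)\big) \xrightarrow{p} 0 \quad \text{as } n \to \infty.$$

Now,

$$n\big(\widehat{\mathrm{Var}}_3(CD) - \widehat{\mathrm{Var}}_2(CD)\big) = \frac{2}{(n-1)} \left\{ \Big[ \frac{1}{\binom{n}{2}} \sum_{1 \le i<j \le n} (a_{ij} b_{ij})^2 \Big] - (CD)^2 \right\} + \left\{ \frac{(n-2)}{(n-1)} - 1 \right\} n \widehat{\mathrm{Var}}_2(CD) \xrightarrow{p} 0 \quad \text{as } n \to \infty. \qquad \square$$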


Next, we provide a brief explanation of each of these estimators. As was shown in the proof of Lemma 2.4.3, $\widehat{\mathrm{Var}}_1(CD) = \frac{4}{n} U^*_n$, where $U^*_n$ is a U-statistic which estimates $\gamma$. Thus, $\widehat{\mathrm{Var}}_1(CD)$ is estimating the asymptotic variance of CD. $\widehat{\mathrm{Var}}_2(CD)$ is also estimating the asymptotic variance of CD, but in a slightly different manner. Recall from basic U-statistic theory that $\gamma$ is the variance of a conditional expectation (Randles and Wolfe, 1979, p. 79) (i.e., $\gamma = \mathrm{Var}[(a_1 b_1)^*]$ where $(a_1 b_1)^* = E[a_{12} b_{12} \mid (Y_{11}, Y_{21})]$). Thus, in $\widehat{\mathrm{Var}}_2(CD)$, for each $(Y_{1i}, Y_{2i})$, the conditional expectation is estimated using all the other $(Y_{1j}, Y_{2j})$'s, $j \neq i$, and then the variance of all these quantities is calculated. That is,

$$\widehat{\mathrm{Var}}_2(CD) = \frac{4}{n^2} \sum_{i=1}^{n} \{ (a_i b_i)^* - CD \}^2$$

where

$$(a_i b_i)^* = \frac{1}{n-1} \sum_{j \neq i} a_{ij} b_{ij}$$

is the sample estimate of $E[a_{ij} b_{ij} \mid (Y_{1i}, Y_{2i})]$.

In contrast to $\widehat{\mathrm{Var}}_1(CD)$ and $\widehat{\mathrm{Var}}_2(CD)$, $\widehat{\mathrm{Var}}_3(CD)$ is estimating the exact variance of CD (2.2.2) derived in Section 2.2. It is using an estimator of $\gamma$ from $\widehat{\mathrm{Var}}_2(CD)$ and estimating $\alpha$ with a difference of two U-statistics which is estimating $E[(a_{ij} b_{ij})^2] - [E(a_{ij} b_{ij})]^2$.

Again, although under $H_0$, $E(a_{ij} b_{ij}) = 0$, the sample estimate for $E(a_{ij} b_{ij})$ (i.e., CD) was left in to possibly increase the power of the test under the alternative.

Each of these variance estimators will be considered in the Monte Carlo study in Chapter Five. Although the calculations look overwhelming if performed by hand, they are all easily programmed on the computer. (See the CDSTAT subroutine in the Monte Carlo program listed in Appendix 2.)
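
A short Python sketch of the three variance estimators is given below. It is an illustration only and is not the CDSTAT subroutine: the function name `variance_estimators` is hypothetical, and the input is assumed to be a symmetric $n \times n$ array `h` with `h[i][j]` equal to the product $a_{ij} b_{ij}$ (only the entries with $i<j$ are used).

```python
from itertools import combinations
from math import comb


def variance_estimators(h):
    """Return (CD, Var1, Var2, Var3) from the matrix of products a_ij * b_ij."""
    n = len(h)
    pairs = list(combinations(range(n), 2))
    cd = sum(h[i][j] for i, j in pairs) / comb(n, 2)
    sum_sq = sum(h[i][j] ** 2 for i, j in pairs)
    # Sum over triples i<j<k of A_ijk * B_ijk (three products per triple).
    sum_ab = sum(h[i][j] * h[i][k] + h[i][k] * h[j][k] + h[i][j] * h[j][k]
                 for i, j, k in combinations(range(n), 3))
    var1 = (4 / n) * sum_ab / (3 * comb(n, 3))
    var2 = (4 / n) * (2 / (n * (n - 1) ** 2) * (sum_ab + sum_sq) - cd ** 2)
    var3 = ((2 / (n * (n - 1))) * (sum_sq / comb(n, 2) - cd ** 2)
            + (4 * (n - 2) / (n * (n - 1))) * (n / 4) * var2)
    return cd, var1, var2, var3


# Small illustrative matrix of products (values chosen arbitrarily).
h = [[0, 1, -1, 0],
     [1, 0, 1, 1],
     [-1, 1, 0, -1],
     [0, 1, -1, 0]]
cd, var1, var2, var3 = variance_estimators(h)
```

For small $n$ the first estimator can be zero or negative, as in this example, since it is an unbiased-type estimate of $4\gamma/n$ rather than a sum of squares.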


2.5   Comments

This chapter has presented a statistic to test the null hypothesis of bivariate symmetry versus the alternative that the marginal distributions differ in their scale parameters. For small samples, a permutation test is recommended. A basic disadvantage of this is that it generally requires the use of a computer for moderate sample sizes (otherwise it is very time consuming to derive the null distribution). For larger sample sizes, it is recommended that $CD / [\widehat{\mathrm{Var}}(CD)]^{1/2}$ be used as an approximation for $CD / [\mathrm{Var}(CD)]^{1/2}$. Thus, for an $\alpha$-level test using the asymptotic distribution, the null hypothesis would be rejected if

$$\left| \frac{CD}{[\widehat{\mathrm{Var}}(CD)]^{1/2}} \right| > Z_{\alpha/2},$$

where $Z_{\alpha/2}$ is the value in a standard normal distribution such that the area to the right of the value is $\alpha/2$.
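
The rejection rule above amounts to a few lines of code. The sketch below uses only the Python standard library; the function name `asymptotic_test` and the numerical inputs in the example are hypothetical.

```python
from statistics import NormalDist


def asymptotic_test(cd, var_hat, level=0.05):
    """Two-sided large-sample test based on CD / sqrt(estimated variance).

    Returns the standardized statistic, the critical value Z_{level/2},
    and whether the null hypothesis is rejected.
    """
    z = cd / var_hat ** 0.5
    critical = NormalDist().inv_cdf(1 - level / 2)
    return z, critical, abs(z) > critical


# Illustrative values only: CD = 0.5 with an estimated variance of 0.04.
z, critical, reject = asymptotic_test(0.5, 0.04)
```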

Chapter  Five  will  present  a  Monte  Carlo  study  which 
uses  the  asymptotic  normal  distribution  of  CD  (with  a 
consistent  variance  estimator)  to  investigate  how  well  the 
test  performs  under  the  null  and  alternative  hypotheses. 
First  though,  some  comments  on  this  chapter. 

Comment  1 


One possible advantage of the CD statistic is the fact that it utilizes information between censored and uncensored pairs whenever possible. In the permutation test, type 4 pairs have no effect on the outcome of the test. That is, they can be ignored, treating the sample as if it were of size $n_1 + n_2 + n_3$. This is understandable since $X_{1i} = X_{2i} = C_i$ and thus they supply no information about the scale of $X'_{1i}$ relative to $X'_{2i}$.

In the asymptotic test, if one estimated the variance in (2.2.2) by estimating $\alpha$ and $\gamma$ with their sample quantities (for example, $\hat{\alpha} = \frac{2}{n(n-1)} \sum_{1 \le i<j \le n} (a_{ij} b_{ij})^2$), it is easily shown that the type 4 pairs have no effect on the value of the test statistic. That is, the value of the test statistic remains the same whether the type 4 pairs are discarded or not. If a different estimate for the variance is used, there is a slight change in the test statistic's value if type 4 pairs are discarded, due to the different variance estimator. Asymptotically, this difference goes to zero, due to the fact that the variance estimates are all estimating the same quantity. Thus, in some sense, the asymptotic test behaves similarly to the permutation test with regard to type 4 pairs.

If $\alpha$ and $\gamma$ are known, they are a function of whether type 4 pairs are included or not. That is, if type 4 pairs were not included in calculating the test statistic (thus $n = n_1 + n_2 + n_3$), the values of $\alpha$ and $\gamma$ would be larger than the values had type 4 pairs been included (since type 4 pairs only contribute 0's and never 1's or -1's). The effect of type 4 pairs on $\alpha$ and $\gamma$ is such that the test statistic's value would be the same (or at least asymptotically the same) whether type 4 pairs were discarded or not.

Comment  2 


A disadvantage of the test is that for small samples CD is not distribution-free. Thus, the permutation test, conditioning on the observed sample pairings, must be performed to achieve a legitimate distribution-free $\alpha$-level test.

Comment 3


It is unclear how the CD statistic would be affected if the marginal distributions of $X'_{1i}$ and $X'_{2i}$ have different locations. It is possible that the assumptions made on the censoring distribution might not be valid (in particular assumption A4, which assumed the same censoring cutoff for $X'_{1i}$ and $X'_{2i}$) or, even if they are valid, that CD does not perform well in these instances. Chapter Five will investigate this problem in further detail.


CHAPTER  THREE 
A  CLASS  OF  TESTS  FOR  TESTING  FOR  DIFFERENCES  IN  SCALE 

3.1   Introduction 


In the previous chapter, a test statistic was presented to test the null hypothesis of bivariate symmetry against the alternative that the marginal distributions differ only in their scale parameters. A shortcoming of the statistic was the fact that the variance of CD depended on the underlying distribution and, thus, for small samples a permutation test had to be done or for large samples the variance had to be estimated. In this chapter, two test statistics will be presented which are nonparametrically distribution-free (conditional on $N_1 = n_1$ and $N_c = n_2 + n_3$) for all sample sizes to test the null hypothesis of bivariate symmetry. The alternative hypotheses are structured by assuming the samples come from a bivariate distribution with c.d.f.

$$F\left( \frac{x_1 - \mu_1}{\sigma_1},\ \frac{x_2 - \mu_2}{\sigma_2} \right), \quad \text{where } F(u,v) = F(v,u) \text{ for every } (u,v) \text{ in } R^2.$$

Tests are developed for both of the following alternatives to the null hypothesis of bivariate symmetry:

Case 1.   $\mu_1 = \mu_2$ known,

$H_0: \sigma_1 = \sigma_2$   and   $H_a: \sigma_1 < \sigma_2$

That is, the marginal distributions have the same known location parameter but, under $H_a$, $X'_{2i}$ has a larger scale parameter than $X'_{1i}$. A possible contour of an absolutely continuous distribution of this form was given in Figure 2.

Case 2.   $\mu_1 = \mu_2$ unknown,

$H_0: \sigma_1 = \sigma_2$   and   $H_a: \sigma_1 < \sigma_2$

Here, the marginal distributions have the same unknown location parameter but, under $H_a$, $X'_{2i}$ has a larger scale parameter than $X'_{1i}$.

(Note, for both cases, the alternative has been stated in the form for a one-sided test. The procedure which will be presented can easily be adapted for the other one-sided or a two-sided alternative. The latter is discussed at the end of this chapter.)

In Sections 3.2 and 3.3, test statistics for Case 1 and Case 2, respectively, will be presented which are nonparametrically distribution-free conditional on $N_1 = n_1$ and $N_c = n_2 + n_3$. In both cases, the test statistics can be viewed as a linear combination of two independent test statistics $T_{n_1}$ and $T_{n_c}$, where $T_{n_1}$ is a statistic based only on the $n_1$ uncensored observations, while $T_{n_c}$ will be a statistic based on the $n_c = n_2 + n_3$ type 2 and 3 censored observations. The conditioning of the random variables $N_1$ and $N_c$ on $n_1$ and $n_2 + n_3$ (respectively) is used throughout Sections 3.2 and 3.3 and, thus, this condition will not always be stated but will be assumed with the use of $n_1$, $n_2$ and $n_3$. Thus, the test statistics will be written as $T_{n_1,n_c}$ and $TM_{n_1,n_c}$ (for Sections 3.2 and 3.3, respectively), which imply conditioning on $N_1 = n_1$ and $N_c = n_c = n_2 + n_3$. Section 3.5 will consider the asymptotic distribution of each test statistic.


3.2  \i  ,  =  y  j  >  Known 

This  section  will  begin  by  introducing  the  notation 

necessary  for  the  statistic  Tn   n   designed  for  the 

1  '  c 

alternative  in  case  1.   Recall,  the  sample  consists  of 

^Xli'X2i^  i=l>2,...,n  where  X,.  =  min(X,.,C.)  and 

i 
X2i  =  min ( X^^ , C^ ) .   These  pairs  were  classified  into  foui 

pair  types.   They  were  the  following: 


Pair  Type 
1 
2 
3 

4 


Pes  crip t ion 

xli<c1,  x2i<ci 

xli<ci,  x2i>ci 

xli>ci,  x2i<ci 

xli>ci,  x2i>ci 


Number  of  Pairs 
in  the  Sample 


where  n  =  n,  +  n0  +  tiq  +  n,.  . 
For convenience and without loss of generality, let the type 1 pairs
occupy positions 1 to $n_1$ in the sample (i.e.,
$\{(X_{11},X_{21}), (X_{12},X_{22}), \ldots, (X_{1n_1},X_{2n_1})\}$) in
random order. Similarly, the type 2 and type 3 pairs will be assumed to
occupy positions $n_1+1, n_1+2, \ldots, n_1+n_c$ in random order.
Lastly, the type 4 pairs occupy positions
$n_1+n_c+1, n_1+n_c+2, \ldots, n$. What is meant by random order is
that the exchangeability property still holds within the $n_1$ type 1
pairs, within the $n_2+n_3$ type 2 or 3 pairs and within the $n_4$
type 4 pairs. This could be accomplished if the pairs were placed into
their respective groupings (type 1, 2 or 3, or 4) arbitrarily, with no
regard to their original position in the sample. Much easier, from a
researcher's point of view, would be to place the pairs into their
respective groupings in the same order in which they occurred in the
sample (i.e., the first uncensored pair is placed into the first
position among the $n_1$ uncensored pairs, the second uncensored pair
into the second position, etc.). This procedure would not affect the
desired exchangeability property, as deduced from the following
argument. In using the second method, the researcher is actually
fixing the positions of the type 1 pairs, type 4 pairs and type 2 or 3
pairs. Thus, instead of $n!$ equally likely arrangements of the
original sample, there are $n_1!\,n_4!\,(n_2+n_3)!$ equally likely
arrangements when the positions and numbers of the pair types are
fixed. Therefore, it follows that each of the $n_1!$ arrangements of
the $n_1$ uncensored pairs is equally likely and that the
exchangeability property still holds within the type 1 uncensored
pairs. Similar arguments hold for the $(n_2+n_3)$ type 2 or 3 pairs
and the $n_4$ type 4 pairs.

The following notation will be used in the statistic $T_{n_1}$, a
statistic which is based on the $n_1$ type 1 pairs. Define a variable
$Z_i$ to be

$$Z_i = |X_{2i} - \mu| - |X_{1i} - \mu| \quad \text{for } i = 1, 2, \ldots, n_1,$$

where $\mu$ is the known and common location parameter. Let $R_i$ be
the absolute rank of $Z_i$ for $i = 1, 2, \ldots, n_1$, that is, the
rank of $|Z_i|$ among $\{|Z_1|, |Z_2|, \ldots, |Z_{n_1}|\}$, and let
$\Psi_i$ be defined as

$$\Psi_i = \Psi(Z_i) = \begin{cases} 1 & \text{if } Z_i > 0 \\ 0 & \text{if } Z_i < 0. \end{cases}$$

Note, the variable $Z_i$ is defined only for the uncensored pairs.
The statistic $T_{n_1}$ is then

$$T_{n_1} = \sum_{i=1}^{n_1} \Psi_i R_i,$$

the Wilcoxon signed rank statistic computed on the $Z_i$'s.
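As a minimal sketch of the definition above, the statistic $T_{n_1}$ can be computed as follows. The function name `signed_rank_scale_statistic` and its argument names are illustrative only, and the code assumes the inputs are the uncensored (type 1) pairs with no ties among the $|Z_i|$ and no $Z_i$ equal to zero.

```python
import numpy as np

def signed_rank_scale_statistic(x1, x2, mu):
    """Wilcoxon signed rank statistic on Z_i = |x2_i - mu| - |x1_i - mu|.

    Assumes every pair is uncensored, no ties among |Z_i|, no Z_i == 0.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    z = np.abs(x2 - mu) - np.abs(x1 - mu)
    # Absolute ranks R_i: rank of |Z_i| among |Z_1|, ..., |Z_n1| (1 = smallest).
    abs_rank = np.argsort(np.argsort(np.abs(z))) + 1
    psi = (z > 0).astype(int)          # Psi_i = 1 if Z_i > 0, else 0
    return int(np.sum(psi * abs_rank))

# Three uncensored pairs with mu = 0: Z = (2.0, -1.5, -0.3).
t_n1 = signed_rank_scale_statistic([1.0, 2.5, -0.5], [3.0, -1.0, 0.2], mu=0.0)
```

In this small example only the first pair has a positive $Z_i$, and it carries the largest absolute rank.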

Notation will now be introduced for the statistic $T_{n_c}$, a
statistic based only on the type 2 and type 3 censored pairs (the
pairs in which only one member has been censored). Define $Q_j$ to be
the rank of $C_j$ among $\{C_{n_1+1}, C_{n_1+2}, \ldots, C_{n_1+n_c}\}$
and

$$\gamma_j = \begin{cases} 1 & \text{if the } j\text{th pair is a type 2 pair} \\ 0 & \text{if the } j\text{th pair is a type 3 pair} \end{cases}$$

for $j = n_1+1, n_1+2, \ldots, n_1+n_c$. The statistic $T_{n_c}$ is
defined as

$$T_{n_c} = \sum_{j=n_1+1}^{n_1+n_c} \gamma_j Q_j = \sum \text{ranks of the } C\text{'s for the type 2 pairs}.$$
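The definition of $T_{n_c}$ can be sketched in the same way. The function name `censored_pair_statistic` is hypothetical, and the code assumes the censoring values of the type 2 and type 3 pairs contain no ties.

```python
import numpy as np

def censored_pair_statistic(c, gamma):
    """Sum of the ranks of the censoring values C_j over the type 2 pairs.

    c     : censoring values of the type 2 and type 3 pairs (no ties assumed)
    gamma : 1 for a type 2 pair, 0 for a type 3 pair
    """
    c = np.asarray(c, dtype=float)
    gamma = np.asarray(gamma, dtype=int)
    q = np.argsort(np.argsort(c)) + 1   # Q_j: rank of C_j among all n_c values
    return int(np.sum(gamma * q))

# Four singly censored pairs; the first and third are type 2.
t_nc = censored_pair_statistic([5.0, 2.0, 9.0, 7.0], [1, 0, 1, 0])
```

Here the censoring values have ranks 2, 1, 4 and 3, so the type 2 pairs contribute ranks 2 and 4.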


A brief explanation of the logic behind the test statistic will be
presented. For the test statistic $T_{n_1}$, if $X_2$ has a larger
scale parameter than $X_1$ (i.e., under $H_a$), then
$|X_{2i} - \mu| - |X_{1i} - \mu|$ should be positive and large. Thus,
the test statistic $T_{n_1}$ would be large. In contrast, if $X_2$ and
$X_1$ have the same scale parameter (i.e., under $H_0$), then
$|X_{2i} - \mu| - |X_{1i} - \mu|$ would be positive approximately as
many times as negative, with no pattern present in the magnitudes of
$|X_{2i} - \mu| - |X_{1i} - \mu|$. Thus the test statistic would be
comparatively smaller.

For the test statistic $T_{n_c}$, if $H_a$ is true, there should be a
preponderance of type 2 censored pairs (relative to the number of
type 3 censored pairs) and these pairs should have the more extreme
censoring values. Figure 5 illustrates this idea. Thus, the test
statistic $T_{n_c}$ would be large. In contrast, if $H_0$ is true, the
number of type 2 pairs should not dominate $n_c$ and the test
statistic $T_{n_c}$ should not be unusually large.

Figure 5. Contour of an Absolutely Continuous Distribution That Has
Equal Marginal Locations and Unequal Marginal Scales with Censoring
Present.

Now we establish certain distributional properties for $T_{n_1}$ and
$T_{n_c}$.

Lemma 3.2.1: Conditional on $n_1$, $T_{n_1}$ has the same null
distribution as the Wilcoxon signed rank statistic.

Proof:

First it will be shown that conditioning on the $n_1$ type 1 pairs
does not affect the exchangeability property (i.e.,
$(X'_{1i}, X'_{2i}, C_i) \stackrel{d}{=} (X'_{2i}, X'_{1i}, C_i)$ still
holds). Let $W_i = (X'_{1i}, X'_{2i}, C_i)$ and
$W_i^* = (X'_{2i}, X'_{1i}, C_i)$, and let

$$G_W(t) = P(X'_{1i} \le t_1, X'_{2i} \le t_2, C_i \le t_3) = E[I(W_i \le t)],$$

where

$$I(W_i \le t) = \begin{cases} 1 & \text{if } X'_{1i} \le t_1,\ X'_{2i} \le t_2,\ C_i \le t_3 \\ 0 & \text{otherwise.} \end{cases}$$

Now, under $H_0$, for the entire sample, we have

$$(X'_{1i}, X'_{2i}, C_i) \stackrel{d}{=} (X'_{2i}, X'_{1i}, C_i),$$

and applying an appropriate function (and Theorem 1.3.7 of Randles and
Wolfe, 1979) gives

$$I(W_i \le t)\, I(\delta_i = 1) \stackrel{d}{=} I(W_i^* \le t)\, I(f(\delta_i) = 1).$$

Taking expectations, it follows that

$$E[I(W_i \le t)\, I(\delta_i = 1)] = E[I(W_i^* \le t)\, I(f(\delta_i) = 1)].$$

Now, recalling that $\delta_i = 1$ iff $f(\delta_i) = 1$, we have

$$E[I(\delta_i = 1)] = E[I(f(\delta_i) = 1)],$$

and it follows that

$$\frac{E[I(W_i \le t)\, I(\delta_i = 1)]}{E[I(\delta_i = 1)]} = \frac{E[I(W_i^* \le t)\, I(f(\delta_i) = 1)]}{E[I(f(\delta_i) = 1)]}.$$

This shows that the c.d.f. of $W_i$ given it is a type 1 pair is equal
to the c.d.f. of $W_i^*$ given it is a type 1 pair, and thus the
exchangeability property holds within the type 1 pairs.

Now, by defining a function
$f_1(a,b,c) = |\min(b,c) - \mu| - |\min(a,c) - \mu|$ and applying
Theorem 1.3.7 (Randles and Wolfe, 1979, page 16) it follows that

$$Z_i = |X_{2i} - \mu| - |X_{1i} - \mu|$$
$$= |\min(X'_{2i}, C_i) - \mu| - |\min(X'_{1i}, C_i) - \mu|$$
$$\stackrel{d}{=} |\min(X'_{1i}, C_i) - \mu| - |\min(X'_{2i}, C_i) - \mu|$$
$$= |X_{1i} - \mu| - |X_{2i} - \mu| = -Z_i,$$

and thus by Theorem 1.3.2 (Randles and Wolfe, 1979, page 14), the
random variable $Z_i$ has a distribution that is symmetric about 0.
The proof of Lemma 3.2.1 follows directly from Theorem 2.4.6 (Randles
and Wolfe, 1979, page 50). □

Lemma 3.2.2: Under $H_0$, the following results hold.

a) Conditional on the fact the pair is type 2 or 3, the random
variables $\gamma_j$ and $C_j$ are independent.

b) Conditional on $n_c$, $T_{n_c}$ has the same null distribution as
the Wilcoxon signed rank statistic.

Proof:

First, it will be shown that conditioning on the $n_c$ type 2 and 3
pairs does not affect the exchangeability property. Define $W_i$,
$W_i^*$, $G_W(t)$ and $I(W_i \le t)$ as in the proof of Lemma 3.2.1.
Now under $H_0$, for the entire sample, we have

$$(X'_{1i}, X'_{2i}, C_i) \stackrel{d}{=} (X'_{2i}, X'_{1i}, C_i),$$

and applying an appropriate function (and Theorem 1.3.7 of Randles and
Wolfe, 1979),

$$I(W_i \le t)\, I(\delta_i \in \{2,3\}) \stackrel{d}{=} I(W_i^* \le t)\, I(f(\delta_i) \in \{2,3\}).$$

Taking expectations, it follows that

$$E[I(W_i \le t)\, I(\delta_i \in \{2,3\})] = E[I(W_i^* \le t)\, I(f(\delta_i) \in \{2,3\})].$$

Recalling that $\delta_i \in \{2,3\}$ iff $f(\delta_i) \in \{2,3\}$,
we have

$$E[I(\delta_i \in \{2,3\})] = E[I(f(\delta_i) \in \{2,3\})].$$

It follows that

$$\frac{E[I(W_i \le t)\, I(\delta_i \in \{2,3\})]}{E[I(\delta_i \in \{2,3\})]} = \frac{E[I(W_i^* \le t)\, I(f(\delta_i) \in \{2,3\})]}{E[I(f(\delta_i) \in \{2,3\})]}.$$

Therefore, conditional on the pair being a type 2 or 3, the
exchangeability property still holds. Thus, it follows that

$$P(\gamma_j = 1, C_j \le c) = P(X'_{1j} < C_j,\ X'_{2j} > C_j,\ C_j \le c)$$
$$= P(X'_{2j} < C_j,\ X'_{1j} > C_j,\ C_j \le c) = P(\gamma_j = 0, C_j \le c).$$

Noting that

$$P(\gamma_j = 1, C_j \le c) + P(\gamma_j = 0, C_j \le c) = P(C_j \le c),$$

we obtain

$$2P(\gamma_j = 1, C_j \le c) = P(C_j \le c),$$

or that

$$P(\gamma_j = 1, C_j \le c) = \tfrac{1}{2} P(C_j \le c) = P(\gamma_j = 1)\, P(C_j \le c),$$

and thus we see that $\gamma_j$ and $C_j$ are independent.

To prove part b), let
$\gamma = (\gamma_{n_1+1}, \gamma_{n_1+2}, \ldots, \gamma_{n_1+n_c})$
and $Q = (Q_{n_1+1}, Q_{n_1+2}, \ldots, Q_{n_1+n_c})$. By Theorem 2.3.3
(Randles and Wolfe, 1979, page 37), $Q$ is uniformly distributed over
$R_{n_c}$, where

$$R_{n_c} = \{q : q \text{ is a permutation of the integers } 1, 2, \ldots, n_c\}.$$

Now, let $q$ be any arbitrary element of $R_{n_c}$ and let $g$ be any
arbitrary $n_c$-vector of 0's and 1's. Thus,

$$P(\gamma = g, Q = q) = P(\gamma = g)\, P(Q = q) \quad \text{(by part a)}$$

and

$$P(\gamma = g)\, P(Q = q) = \frac{1}{2^{n_c}} \times \frac{1}{n_c!},$$

which proves part b). □

By Lemmas 3.2.1 and 3.2.2, $T_{n_1}$ and $T_{n_c}$ are
nonparametrically distribution-free conditional on $n_1$ and $n_c$,
respectively.
Lemma 3.2.3: Under $H_0$, the following results hold.

a) Conditional on $n_1$, $E(T_{n_1}) = n_1(n_1+1)/4$ and
$\mathrm{Var}(T_{n_1}) = n_1(n_1+1)(2n_1+1)/24$.

b) Conditional on $n_c$, $E(T_{n_c}) = n_c(n_c+1)/4$ and
$\mathrm{Var}(T_{n_c}) = n_c(n_c+1)(2n_c+1)/24$.

c) Conditional on $n_1$ and $n_c$, $T_{n_1}$ and $T_{n_c}$ are
independent.

Proof:

The proof of parts a) and b) follows directly from Lemmas 3.2.1 and
3.2.2 and the fact that the Wilcoxon signed rank statistic based on a
sample of size $n$ has a mean of $n(n+1)/4$ and a variance of
$n(n+1)(2n+1)/24$.

The proof of part c) is also trivial, following from the fact that
$T_{n_1}$ and $T_{n_c}$ are based on sets of mutually independent
observations. □
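The mean and variance quoted in the proof can be checked for small samples by brute force. The sketch below is illustrative only (the helper name `signed_rank_moments` is not from the text): it enumerates all $2^n$ equally likely indicator vectors attached to the ranks $1, \ldots, n$, which is the null structure established in Lemmas 3.2.1 and 3.2.2.

```python
from itertools import product

def signed_rank_moments(n):
    """Exact null mean and variance of sum(indicator_i * rank_i).

    Every 0/1 indicator vector of length n has probability 2**-n, and the
    ranks 1..n can be taken in fixed order because the sum does not depend
    on which position holds which rank.
    """
    values = [sum(rank for rank, bit in zip(range(1, n + 1), bits) if bit)
              for bits in product((0, 1), repeat=n)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

n = 6
mean, var = signed_rank_moments(n)
formula_mean = n * (n + 1) / 4                  # n(n+1)/4
formula_var = n * (n + 1) * (2 * n + 1) / 24    # n(n+1)(2n+1)/24
```

The enumeration grows as $2^n$, so it is only meant as a check for small $n$, not as a way to build tables.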

With these preliminary results out of the way, the test statistic
$T_{n_1,n_c}$ can now be defined by

$$T_{n_1,n_c} = L_1 T_{n_1} + L_2 T_{n_c} = L_1 \sum_{i=1}^{n_1} \Psi_i R_i + L_2 \sum_{j=n_1+1}^{n_1+n_c} \gamma_j Q_j,$$

where $L_1$ and $L_2$ are finite constants.
Theorem 3.2.4: Under $H_0$,

a) $E(T_{n_1,n_c}) = L_1 E(T_{n_1}) + L_2 E(T_{n_c}) = (L_1 n_1(n_1+1) + L_2 n_c(n_c+1))/4$,

b) $\mathrm{Var}(T_{n_1,n_c}) = (L_1^2 n_1(n_1+1)(2n_1+1) + L_2^2 n_c(n_c+1)(2n_c+1))/24$,

c) $T_{n_1,n_c}$ is symmetrically distributed about $E(T_{n_1,n_c})$,
and

d) for fixed constants $L_1$ and $L_2$, $T_{n_1,n_c}$ is
nonparametrically distribution-free.

Proof:

The proof of parts a) and b) follows directly from Lemmas 3.2.2 and
3.2.3. To prove part c), it is known that the Wilcoxon signed rank
statistic is symmetric about its mean. Thus, $T_{n_1}$ and $T_{n_c}$
are symmetric about $E(T_{n_1})$ and $E(T_{n_c})$, respectively. Since
$T_{n_1}$ and $T_{n_c}$ are independent (conditional on $N_1 = n_1$
and $N_c = n_c$), the symmetry of $T_{n_1,n_c}$ follows.

To prove part d), note that

$$P(T_{n_1,n_c} = k) = P(L_1 T_{n_1} + L_2 T_{n_c} = k)$$
$$= \sum_{\{k_c\}} P(L_1 T_{n_1} = k - k_c \mid L_2 T_{n_c} = k_c)\, P(L_2 T_{n_c} = k_c)$$
$$= \sum_{\{k_c\}} P(L_1 T_{n_1} = k - k_c)\, P(L_2 T_{n_c} = k_c),$$

where $\{k_c\}$ is the set of all possible values of $L_2 T_{n_c}$.
Now using the nonparametrically distribution-free property of
$T_{n_1}$ and $T_{n_c}$ established in Lemmas 3.2.1 and 3.2.2, it
follows that for fixed $L_1$ and $L_2$, $L_1 T_{n_1}$ and
$L_2 T_{n_c}$ are also nonparametrically distribution-free, and hence
so is every term of the sum above. □


The conditional null hypothesis distribution of $T_{n_1,n_c}$ can be
obtained using the fact that it is a convolution of two Wilcoxon
signed rank test statistics' null distributions. Thus, for fixed $L_1$
and $L_2$, the distribution can be tabled. The tables in Appendix 1
give the critical values for $T_{n_1,n_c}$ with $L_1 = 1$ and
$L_2 = 1$ for $n_1 = 1, 2, \ldots, 15$ and $n_c = 1, 2, \ldots, 10$ at
the .01, .025, .05 and .10 levels of significance. The actual
$\alpha$-levels are also reported for the cut-offs given. The decision
rule for the test is to reject $H_0$ if the calculated test statistic
is greater than or equal to the critical value given in the table at
the desired level of significance. A two-tailed test (i.e., for
$H_a$: $\sigma_2 \ne \sigma_1$) could be performed by using the
symmetry of the null hypothesis distribution and the table to
determine the lower critical value for the test statistic.
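The convolution described above can be sketched numerically for $L_1 = L_2 = 1$. This is an illustration under stated assumptions rather than the procedure used to build the Appendix tables: the function names are hypothetical, and the critical value is taken to be the smallest $k$ whose upper-tail probability $P(T_{n_1,n_c} \ge k)$ does not exceed the nominal level.

```python
import numpy as np

def signed_rank_pmf(n):
    """Null pmf of the Wilcoxon signed rank statistic; index = value."""
    pmf = np.array([1.0])
    for rank in range(1, n + 1):
        new = np.zeros(len(pmf) + rank)
        new[:len(pmf)] += 0.5 * pmf   # this rank is attached to indicator 0
        new[rank:] += 0.5 * pmf       # this rank is attached to indicator 1
        pmf = new
    return pmf

def combined_null_pmf(n1, nc):
    """Null pmf of T_{n1} + T_{nc} (independent components, L1 = L2 = 1)."""
    return np.convolve(signed_rank_pmf(n1), signed_rank_pmf(nc))

def upper_critical_value(n1, nc, alpha):
    """Smallest k with P(T >= k) <= alpha and its actual level, or None."""
    pmf = combined_null_pmf(n1, nc)
    upper_tail = np.cumsum(pmf[::-1])[::-1]   # upper_tail[k] = P(T >= k)
    ok = upper_tail <= alpha
    if not ok.any():
        return None                            # no cut-off attains this level
    k = int(np.argmax(ok))
    return k, float(upper_tail[k])

critical = upper_critical_value(3, 2, 0.05)
```

For very small $n_1$ and $n_c$ the attainable levels are coarse, which is why the actual level is returned alongside the cut-off.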

A test of $H_0$ for larger $n_1$ and $n_c$ can be based on the
asymptotic distribution of $T_{n_1,n_c}$, which will be presented in
Section 3.4.
3.3   $\mu_1 = \mu_2$, Unknown

In the previous section, the common location parameter was assumed to
be known. Generally, this is not the case. More often we may assume a
common location parameter, but this parameter is unknown. This section
will present a slight modification to the test statistic
$T_{n_1,n_c}$ to be used in these settings. The modification will be
to estimate the common location parameter using a "smoothed" median
estimator based on the product-limit (Kaplan-Meier) estimate of the
survival distribution (Kaplan and Meier, 1958). This estimated
location parameter, $M$, replaces $\mu$ in the previous definitions.
That is, define the variable $Z_i$ to be

$$Z_i = |X_{2i} - M| - |X_{1i} - M|, \quad i = 1, 2, \ldots, n_1.$$

The definitions of $\Psi_i$, $R_i$, $\gamma_j$, $Q_j$, $T_{n_1}$,
$T_{n_c}$ and $T_{n_1,n_c}$ remain unchanged. In this section, the
statistic will be denoted by $TM_{n_1,n_c}$ to identify the fact that
the location parameter was estimated with a "smoothed" median
estimator based on the product-limit estimate of the survival
distribution. This estimation does not affect the results in Section
3.2, but Lemmas 3.2.1 and 3.2.3 c) must be reproved, since in the
proof of 3.2.1 we utilized the independence of the $Z_i$'s, a
condition which no longer holds. Also, in 3.2.3 c) $T_{n_1}$ and
$T_{n_c}$ were based on sets of mutually independent observations.
This is not the case in the current context.

First, we introduce the "smoothed" median estimator and the
product-limit estimate of the survival distribution.

Let $(Y_{(1)}, Y_{(2)}, \ldots, Y_{(2n_1+n_2+n_3)})$ represent the
ordered uncensored observations. (This ignores the fact that the
original observations were bivariate pairs, and considers only the
$2n_1+n_2+n_3$ uncensored observations, i.e., the $2n_1$ components
belonging to type 1 pairs, the $n_2$ uncensored components of type 2
pairs and the $n_3$ uncensored components of type 3 pairs.) That is,
$X_{ij} = Y_{(k)}$ if $X_{ij}$ is uncensored and $X_{ij}$ has rank $k$
when ranked among the set of all uncensored observations from either
(both) components of the pairs, for $i = 1, 2$ and
$j = 1, 2, \ldots, n$. Let $n_{(k)}$, $k = 1, 2, \ldots, 2n_1+n_2+n_3$,
be the number of censored and uncensored observations which are
greater than or equal to $Y_{(k)}$. Thus,

$$n_{(k)} = \sum_{i=1}^{2} \sum_{j=1}^{n} I(X_{ij} \ge Y_{(k)}),$$

where $I$ is the indicator function which takes on a value of one when
the argument is true and zero otherwise.

The product-limit estimate of the survival distribution is defined as

$$S(t) = \begin{cases} 1 & \text{if } t < Y_{(1)} \\ \prod_{k=1}^{j} (n_{(k)} - 1)/n_{(k)} & \text{if } Y_{(j)} \le t < Y_{(j+1)},\ j = 1, 2, \ldots, 2n_1+n_2+n_3-1 \\ 0 & \text{if } t \ge Y_{(2n_1+n_2+n_3)}. \end{cases}$$

(Note that $Y_{(1)}$ is the smallest uncensored observation and
$Y_{(2n_1+n_2+n_3)}$ is the largest uncensored observation.) The
definition given here assumes no ties in the uncensored observations.
This is valid under assumptions A2 and A3.
Using the above definition, the "smoothed" median estimator $M$ is

$$M = \begin{cases} m_1 + \dfrac{S(m_1) - 0.5}{S(m_1) - S(m_2)}\,(m_2 - m_1) & \text{if } m_1 \ne m_2 \\ m_1 & \text{if } m_1 = m_2, \end{cases}$$

where

$$m_1 = \max\{Y_{(i)} : S(Y_{(i)}) \ge 1/2\}$$

and

$$m_2 = \min\{Y_{(i)} : S(Y_{(i)}) \le 1/2\}.$$

A brief explanation of this estimator follows.

The product-limit estimate of the survival function, $S(t)$, is a
right continuous step function which has jumps at the uncensored
observations. An intuitive estimate for the common median is the value
of $Y_{(i)}$ such that $S(Y_{(i)}) = 1/2$, which often does not exist
due to the nature of $S(t)$. Thus, the "smoothed" estimator was
suggested by Miller (1981, pg. 75), which can be viewed as a linear
interpolation between $m_1$ and $m_2$. If a $Y_{(i)}$ exists such that
$S(Y_{(i)}) = 1/2$, then $m_1 = m_2$ and $M$ is that value of
$Y_{(i)}$ by definition.
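A minimal sketch of the product-limit estimate and the "smoothed" median is given below. It rests on several assumptions that should be checked against the definitions in use: the function names are hypothetical; the observations are passed as one pooled list with a flag marking the uncensored ones; there are no ties among uncensored values and at least two uncensored values; $S(t)$ is set to 0 from the largest uncensored observation onward; and $m_1$, $m_2$ are read as the largest uncensored value with $S \ge 1/2$ and the smallest with $S \le 1/2$. Exact fractions are used so that the case $S(Y_{(i)}) = 1/2$ is detected without rounding error.

```python
from fractions import Fraction

def product_limit(values, uncensored):
    """Ordered uncensored values and S evaluated at each of them."""
    ys = sorted(v for v, u in zip(values, uncensored) if u)
    surv, s = [], Fraction(1)
    for y in ys:
        at_risk = sum(1 for v in values if v >= y)   # censored and uncensored
        s *= Fraction(at_risk - 1, at_risk)
        surv.append(s)
    surv[-1] = Fraction(0)   # assumed convention past the largest uncensored value
    return ys, surv

def smoothed_median(values, uncensored):
    """Linear interpolation of S(t) = 1/2 between m1 and m2."""
    ys, surv = product_limit(values, uncensored)
    half = Fraction(1, 2)
    m1, s1 = max((y, s) for y, s in zip(ys, surv) if s >= half)
    m2, s2 = min((y, s) for y, s in zip(ys, surv) if s <= half)
    if m1 == m2:
        return float(m1)
    return float(m1 + (s1 - half) / (s1 - s2) * (m2 - m1))

# Six pooled observations; the values 3 and 6 are censored.
m = smoothed_median([1, 2, 3, 4, 5, 6], [True, True, False, True, True, False])
```

In the example $S$ takes the values 5/6, 2/3 and 4/9 at the uncensored values 1, 2 and 4, so the interpolation runs between 2 and 4.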

Lemma 3.3.1: The statistic $M$ is a symmetric function of the sample
observations.

Proof:

Let $(Y^*_{(1)}, Y^*_{(2)}, \ldots, Y^*_{(2n)})$ represent the ordered
$2n$ observations, where
$Y^*_{(1)} \le Y^*_{(2)} \le \cdots \le Y^*_{(2n)}$. This again is
ignoring the fact that the original observations consisted of $n$
bivariate pairs and treats the sample as if it consisted of $2n$
observations (some of which are censored). Under assumption A2, there
are no ties among the uncensored observations. Similarly, by
assumptions A2 and A3, there are no ties between an uncensored and a
censored observation, although there may be ties (of size two) among
the censored observations because type 4 pairs contribute two
components with the same value. The product-limit estimator $S(t)$ can
be viewed as a function of the vectors
$(Y^*_{(1)}, Y^*_{(2)}, \ldots, Y^*_{(2n)})$ and
$(I_{(1)}, I_{(2)}, \ldots, I_{(2n)})$, where

$$I_{(j)} = \begin{cases} 1 & \text{if } Y^*_{(j)} \text{ is uncensored} \\ 0 & \text{otherwise,} \end{cases}$$

in view of the fact that

$$n_{(i)} = \sum_{i'=1}^{2} \sum_{j=1}^{n} I(X_{i'j} \ge Y_{(i)}) = 2n + 1 - (\text{rank of } Y_{(i)} \text{ in } (Y^*_{(1)}, Y^*_{(2)}, \ldots, Y^*_{(2n)})).$$

In addition, $S(t)$ can be expressed as

$$S(t) = \begin{cases} 1 & \text{if } t < \min\{Y^*_{(i)} : I_{(i)} = 1\} \\ 0 & \text{if } t \ge \max\{Y^*_{(i)} : I_{(i)} = 1\} \\ \prod_{j:\, Y^*_{(j)} \le t} \left( \dfrac{2n - j}{2n - j + 1} \right)^{I_{(j)}} & \text{otherwise.} \end{cases}$$

Thus, $S(t)$ is a symmetric function with respect to the sample
observations and therefore $M$, being a function of $S(t)$, is also. □

Lemma 3.3.2: Conditional on $n_1$, $T_{n_1}$ has the same null
distribution as the Wilcoxon signed rank statistic.

Proof:

Let $\Psi = (\Psi_1, \Psi_2, \ldots, \Psi_{n_1})$, where
$\Psi_i = \Psi(Z_i)$, and $R = (R_1, R_2, \ldots, R_{n_1})$ with
$R_i$ = absolute rank of $Z_i$. Let $\psi_0$ be any arbitrary element
of

$$\mathcal{P} = \{\psi_0 : \psi_0 \text{ is a } 1 \times n_1 \text{ vector of 0's and 1's}\}$$

(of which there are $2^{n_1}$ different elements), and let $r$ be any
arbitrary element of

$$\mathcal{R} = \{r : r \text{ is a permutation of the integers } 1, 2, \ldots, n_1\}.$$

Now, under the null hypothesis,

$$(X'_{1i}, X'_{2i}, C_i) \stackrel{d}{=} (X'_{2i}, X'_{1i}, C_i),$$

and thus letting $X_{1i} = \min(X'_{1i}, C_i)$ and
$X_{2i} = \min(X'_{2i}, C_i)$, it follows that

$$(X_{1i}, X_{2i}) \stackrel{d}{=} (X_{2i}, X_{1i})$$

for $i = 1, 2, \ldots, n_1$, and these pairs are also exchangeable.

Now, let the superscript $k$ be an operator such that

$$(X_{1i}, X_{2i})^{k} = \begin{cases} (X_{1i}, X_{2i}) & \text{if } k = 1 \\ (X_{2i}, X_{1i}) & \text{if } k = 0. \end{cases}$$

Thus, under $H_0$ and using the exchangeability property, it follows
that for any vector $k = (k_1, k_2, \ldots, k_{n_1})$ of 0's and 1's,

$$\{(X_{11}, X_{21}), (X_{12}, X_{22}), \ldots, (X_{1n_1}, X_{2n_1})\} \stackrel{d}{=} \{(X_{1r_1}, X_{2r_1})^{k_1}, (X_{1r_2}, X_{2r_2})^{k_2}, \ldots, (X_{1r_{n_1}}, X_{2r_{n_1}})^{k_{n_1}}\}. \quad (3.3.1)$$

Recalling from Lemma 3.3.1 that $M$, the estimate of the location
parameter, is a symmetric function of the components of the
observation pairs, and defining a function

$$f_1(y_1, y_2) = |y_2 - M| - |y_1 - M| = Z,$$

it follows from applying this function to (3.3.1) that

$$\{Z_1, Z_2, \ldots, Z_{n_1}\} \stackrel{d}{=} \{(Z_{r_1})^{k_1}, (Z_{r_2})^{k_2}, \ldots, (Z_{r_{n_1}})^{k_{n_1}}\}, \quad (3.3.2)$$

where

$$(Z_{r_i})^{k_i} = \begin{cases} Z_{r_i} & \text{if } k_i = 1 \\ -Z_{r_i} & \text{if } k_i = 0. \end{cases}$$

Now define a function $f_2(Z) = (\Psi, R)$, where $\Psi$ and $R$ are
$1 \times n_1$ vectors such that

$$\Psi_j = \begin{cases} 1 & \text{if } Z_j > 0 \\ 0 & \text{if } Z_j < 0 \end{cases}$$

and $R_j$ = absolute rank of $Z_j$, i.e., the rank of $|Z_j|$ among
$\{|Z_1|, |Z_2|, \ldots, |Z_{n_1}|\}$, for $j = 1, 2, \ldots, n_1$.
Applying this function to (3.3.2), it follows that

$$(\Psi_1, \Psi_2, \ldots, \Psi_{n_1}, R_1, R_2, \ldots, R_{n_1})$$
$$\stackrel{d}{=} (\Psi_{r_1}, \Psi_{r_2}, \ldots, \Psi_{r_{n_1}}, R_{r_1}, R_{r_2}, \ldots, R_{r_{n_1}})$$
$$\stackrel{d}{=} ((\Psi_{r_1})^{k_1}, (\Psi_{r_2})^{k_2}, \ldots, (\Psi_{r_{n_1}})^{k_{n_1}}, R_{r_1}, R_{r_2}, \ldots, R_{r_{n_1}}),$$

where

$$(\Psi_{r_i})^{k_i} = \begin{cases} \Psi_{r_i} & \text{if } k_i = 1 \\ 1 - \Psi_{r_i} & \text{if } k_i = 0. \end{cases}$$

Now since $k$ and $r$ were arbitrary vectors, it follows that all
$2^{n_1} n_1!$ possible values of $(\Psi, R)$ are equally likely, that
is,

$$P(\Psi = \psi_0, R = r) = \frac{1}{2^{n_1}} \times \frac{1}{n_1!}.$$

Thus, noting that this produces the same null distribution as that of
the Wilcoxon signed rank statistic, the proof is complete. □

Lemma 3.3.3: Conditional on $n_1$ and $n_c$, $T_{n_1}$ and $T_{n_c}$
are independent.

Proof:

This proof is done in a series of steps which are stated as Claim 1 to
Claim 7 in an attempt to avoid confusion.

Let $\gamma_i$ be defined as before and let $(x_i, c_i)$ denote the
observed value of the $i$th type 2 or 3 pair,
$i = n_1+1, \ldots, n_1+n_c$. Note, one component was censored, and
thus its observed value was $c_i$, while the other component was
uncensored and its value is denoted by $x_i$. This is not specifying
which component ($X_{1i}$ or $X_{2i}$) was censored.

Claim 1: $\gamma_i$ is independent of $(x_i, c_i)$.

This follows by noting that under $H_0$ and using the exchangeability
property of type 2 and 3 pairs (as was shown in Lemma 3.2.2),

$$P\{\gamma_i = 1 \mid (x_i, c_i)\} = P\{X_{1i} = x_i, X_{2i} = c_i \mid (x_i, c_i)\}$$
$$= P\{X_{1i} = c_i, X_{2i} = x_i \mid (x_i, c_i)\} = P\{\gamma_i = 0 \mid (x_i, c_i)\}.$$

Since $P\{\gamma_i = 1 \mid (x_i, c_i)\} + P\{\gamma_i = 0 \mid (x_i, c_i)\} = 1$,
Claim 1 follows. Now define
$\gamma = (\gamma_{n_1+1}, \gamma_{n_1+2}, \ldots, \gamma_{n_1+n_c})$.

Claim 2: $\gamma$ is a vector of $n_c$ i.i.d. Bernoulli random
variables which are independent of
$\{(x_{n_1+1}, c_{n_1+1}), (x_{n_1+2}, c_{n_1+2}), \ldots, (x_{n_1+n_c}, c_{n_1+n_c})\}$.

This follows from Claim 1 and the fact that
$\{(x_{n_1+1}, c_{n_1+1}), (x_{n_1+2}, c_{n_1+2}), \ldots, (x_{n_1+n_c}, c_{n_1+n_c})\}$
are i.i.d.

Claim 3: $\gamma$ is independent of
$\{(x_{n_1+1}, c_{n_1+1}), (x_{n_1+2}, c_{n_1+2}), \ldots, (x_{n_1+n_c}, c_{n_1+n_c})\}$,
$x_{n_1}$ and $x_{n_4}$, where

$$x_{n_1} = \{(x_{11}, x_{21}), (x_{12}, x_{22}), \ldots, (x_{1n_1}, x_{2n_1})\}$$

and

$$x_{n_4} = \{(c_{n_1+n_c+1}, c_{n_1+n_c+1}), \ldots, (c_n, c_n)\}$$

(i.e., the observed totally uncensored type 1 pairs and the observed
totally censored type 4 pairs, respectively).

This follows from Claim 2 and the fact that $\gamma$ is a function of
the type 2 and 3 pairs only.

Claim 4: $\gamma$ is independent of $x_{n_1}$, $x_{n_4}$,
$x_{(n_c)}$ and $c_{(n_c)}$, where $x_{(n_c)}$ denotes the observed
ordered uncensored members of type 2 and 3 pairs and $c_{(n_c)}$
denotes the observed ordered censored members of type 2 and 3 pairs.

Note, this claim follows directly from Claim 3 and the fact that
$x_{(n_c)}$ and $c_{(n_c)}$ are functions of
$\{(x_{n_1+1}, c_{n_1+1}), (x_{n_1+2}, c_{n_1+2}), \ldots, (x_{n_1+n_c}, c_{n_1+n_c})\}$
only.

Claim 5: $\gamma_c$ is independent of $x_{n_1}$, $x_{n_4}$,
$x_{(n_c)}$ and $c_{(n_c)}$, where
$\gamma_c = \{\gamma_{c(1)}, \gamma_{c(2)}, \ldots, \gamma_{c(n_c)}\}$,
$c_{(i)}$ is the $i$th element of $c_{(n_c)}$ and $\gamma_{c(i)}$ is
the $\gamma$ which corresponds to the pair of which $c_{(i)}$ was a
member.

This claim follows from Theorem 1.3.5 of Randles and Wolfe (1979) and
since $\gamma_c$ is a fixed permutation of $\gamma$. Note that the
i.i.d. property still holds for the $\gamma_{c(i)}$'s.

Claim 6: Given $x_{n_1}$, $x_{n_4}$, $x_{(n_c)}$ and $c_{(n_c)}$,
$T_{n_1}$ is no longer a random variable; that is, the value of
$T_{n_1}$ is observed.

This follows directly from the definition of $T_{n_1}$.

Claim 7: Note that

$$T_{n_c} = \sum_{j=n_1+1}^{n_1+n_c} \gamma_j Q_j = \sum_{j=1}^{n_c} j\, \gamma_{c(j)},$$

which shows that $T_{n_c}$ is a function of $\gamma_c$ and is
independent of $x_{n_1}$, $x_{n_4}$, $x_{(n_c)}$ and $c_{(n_c)}$.

Thus, $T_{n_c}$ has a null distribution equivalent to the Wilcoxon
signed rank null distribution and is independent of $T_{n_1}$, which
is a function of $x_{n_1}$, $x_{n_4}$, $x_{(n_c)}$ and $c_{(n_c)}$
only. □

With the proof of Lemma 3.3.3, Theorem 3.2.4 is valid for the modified
test statistic $TM_{n_1,n_c}$. That is, under $H_0$ and conditional on
$n_1$ and $n_c$, $TM_{n_1,n_c}$ has the same distributional properties
stated in Theorem 3.2.4 for $T_{n_1,n_c}$, and the tables in the
appendix are valid.
3.4   Asymptotic Properties

In this section, the asymptotic distribution of the test statistic
$T_{n_1,n_c}$ (and $TM_{n_1,n_c}$) under $H_0$ will be established.
The asymptotic normality of the test statistic will be presented
first conditional on $N_1 = n_1$ and $N_c = n_c$, with both tending to
infinity, and second with only $n$ tending to infinity. In the second
case, this is the unconditional asymptotic distribution, since it only
requires that the sample size go to infinity. Note that, under
assumption A5 (A5 stated that the probability of a type 4 pair is less
than one), as $n \to \infty$, $N_1 + N_c$ = ($n$ - number of type 4
pairs) $\to \infty$ also. The asymptotics will be presented for the
test statistic $T_{n_1,n_c}$ only. In the previous section, it was
shown that under $H_0$ and conditional on $N_1 = n_1$ and
$N_c = n_c$, $T_{n_1,n_c}$ and $TM_{n_1,n_c}$ have the same null
distribution; that is, $T_{n_1,n_c} \stackrel{d}{=} TM_{n_1,n_c}$.
Therefore, they have the same cumulative distribution function and
thus their asymptotic distributions are the same. There is no need to
prove them separately.

Theorem 3.4.1: Conditional on $N_1 = n_1$ and $N_c = n_c$, under
$H_0$,

$$\frac{T_{n_1,n_c} - E(T_{n_1,n_c})}{\sigma(T_{n_1,n_c})} \stackrel{d}{\to} N(0,1) \quad \text{as } n_1 \to \infty \text{ and } n_c \to \infty,$$

where

$$E(T_{n_1,n_c}) = (L_1 n_1(n_1+1) + L_2 n_c(n_c+1))/4$$

and

$$\sigma(T_{n_1,n_c}) = [\mathrm{Var}(T_{n_1,n_c})]^{1/2} = [(L_1^2 n_1(n_1+1)(2n_1+1) + L_2^2 n_c(n_c+1)(2n_c+1))/24]^{1/2}.$$
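The standardization in Theorem 3.4.1 can be sketched as a large-sample, upper-tailed test. The function name and the choice to return a one-sided p-value are illustrative assumptions; the mean and variance are those of Theorem 3.2.4, and the normal tail is evaluated with the standard library's `math.erfc`.

```python
import math

def normal_approx_upper_pvalue(t, n1, nc, l1=1.0, l2=1.0):
    """Standardized statistic and approximate P(T >= t) under H0.

    Relies on the conditional normal limit, so it is meant for n1 and nc
    both large; small samples call for the exact conditional distribution.
    """
    mean = (l1 * n1 * (n1 + 1) + l2 * nc * (nc + 1)) / 4
    var = (l1 ** 2 * n1 * (n1 + 1) * (2 * n1 + 1)
           + l2 ** 2 * nc * (nc + 1) * (2 * nc + 1)) / 24
    z = (t - mean) / math.sqrt(var)
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))   # 1 - Phi(z)
    return z, p_upper

# With n1 = 20 and nc = 15 the null mean is 165 and the null variance 1027.5.
z, p = normal_approx_upper_pvalue(220, n1=20, nc=15)
```

A two-sided version would double the smaller tail, using the symmetry stated in Theorem 3.2.4 c).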


Proof:

First, it will be shown that $T_{n_1}$ and $T_{n_c}$ have asymptotic
normal distributions. Without loss of generality, it will be assumed
that $\mu = 0$.

Note that

$$T_{n_1} = \sum_{i=1}^{n_1} \Psi_i R_i = \sum_{i=1}^{n_1} \Psi(|X_{2i}| - |X_{1i}|) + \sum\sum_{1 \le i < j \le n_1} \Psi(|X_{2i}| - |X_{1i}| + |X_{2j}| - |X_{1j}|)$$
$$= n_1 U_{1,n_1} + \binom{n_1}{2} U_{2,n_1},$$

where

$$U_{1,n_1} = \frac{1}{n_1} \sum_{i=1}^{n_1} \Psi(|X_{2i}| - |X_{1i}|)$$

and

$$U_{2,n_1} = \binom{n_1}{2}^{-1} \sum\sum_{1 \le i < j \le n_1} \Psi(|X_{2i}| - |X_{1i}| + |X_{2j}| - |X_{1j}|)$$

are two U-statistics (Randles and Wolfe, 1979, page 83). It follows
that

$$\frac{n_1^{1/2}}{\binom{n_1}{2}} \left( T_{n_1} - n_1(n_1+1)/4 \right) = \frac{n_1^{3/2}}{\binom{n_1}{2}} \left( U_{1,n_1} - E(U_{1,n_1}) \right) + n_1^{1/2} \left( U_{2,n_1} - E(U_{2,n_1}) \right).$$

Now notice that $0 \le U_{1,n_1} \le 1$ and, under $H_0$,
$E(U_{1,n_1}) = P\{(|X_{2i}| - |X_{1i}|) > 0\} = 1/2$, so that
$|U_{1,n_1} - 1/2| \le 1/2$. Therefore,

$$\frac{n_1^{3/2}}{\binom{n_1}{2}} \left| U_{1,n_1} - 1/2 \right| \le \frac{n_1^{3/2}}{n_1(n_1 - 1)} \to 0 \quad \text{as } n_1 \to \infty.$$

Thus, $\frac{n_1^{1/2}}{\binom{n_1}{2}} (T_{n_1} - n_1(n_1+1)/4)$ and
$n_1^{1/2}(U_{2,n_1} - 1/2)$ have the same limiting distribution as
$n_1 \to \infty$.

By Theorem 3.3.13 of Randles and Wolfe (1979), it is seen that
$n_1^{1/2}(U_{2,n_1} - 1/2)$ has a limiting normal distribution with
mean 0 and variance $r^2 \zeta_1$ (provided $\zeta_1 > 0$), where

$$r^2 \zeta_1 = 2^2 \{ E[\Psi(|X_{2i}| - |X_{1i}| + |X_{2j}| - |X_{1j}|) \times \Psi(|X_{2i}| - |X_{1i}| + |X_{2k}| - |X_{1k}|)] - 1/4 \} = 1/3.$$

Thus,

$$\frac{T_{n_1} - n_1(n_1+1)/4}{\binom{n_1}{2} \left( \frac{1}{3n_1} \right)^{1/2}} \stackrel{d}{\to} N(0,1).$$

Note that

$$\frac{\binom{n_1}{2} \left( \frac{1}{3n_1} \right)^{1/2}}{\left[ \frac{n_1(n_1+1)(2n_1+1)}{24} \right]^{1/2}} \to 1$$

as $n_1 \to \infty$. Therefore, after applying Slutsky's Theorem
(Theorem 3.2.8, Randles and Wolfe, 1979),

$$\frac{T_{n_1} - n_1(n_1+1)/4}{\sigma(T_{n_1})} \stackrel{d}{\to} N(0,1).$$
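The U-statistic representation used in this part of the proof can be checked numerically: the signed rank statistic equals the number of positive $Z_i$ (that is, $n_1 U_{1,n_1}$) plus the number of pairs $i < j$ with $Z_i + Z_j > 0$ (that is, $\binom{n_1}{2} U_{2,n_1}$). The sketch below uses hypothetical function names and assumes no ties among the $|Z_i|$ and no zero sums.

```python
from itertools import combinations

def signed_rank_direct(z):
    """Sum of the absolute ranks of the positive z values."""
    order = sorted(range(len(z)), key=lambda i: abs(z[i]))
    return sum(rank for rank, i in enumerate(order, start=1) if z[i] > 0)

def signed_rank_from_u_statistics(z):
    """Same statistic written as n*U1 + C(n,2)*U2."""
    n = len(z)
    n_u1 = sum(1 for zi in z if zi > 0)                    # n * U_1
    pairs_u2 = sum(1 for i, j in combinations(range(n), 2)
                   if z[i] + z[j] > 0)                     # C(n,2) * U_2
    return n_u1 + pairs_u2

z = [2.0, -1.5, -0.3, 0.7, -3.1]
direct = signed_rank_direct(z)
via_u = signed_rank_from_u_statistics(z)
```

Both routes give the same value because the absolute rank of a positive $Z_i$ counts itself plus every $Z_j$ of smaller magnitude, and $Z_i + Z_j > 0$ exactly when the term of larger magnitude is positive.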


Similarly,

$$T_{n_c} = \sum_{j=n_1+1}^{n_1+n_c} \gamma_j Q_j = \sum_{j=n_1+1}^{n_1+n_c} \gamma_j + \sum\sum_{n_1+1 \le j < k \le n_1+n_c} \{ \gamma_j \Psi(C_j - C_k) + \gamma_k \Psi(C_k - C_j) \}$$
$$= n_c U_{3,n_c} + \binom{n_c}{2} U_{4,n_c},$$

where

$$U_{3,n_c} = \frac{1}{n_c} \sum_{j=n_1+1}^{n_1+n_c} \gamma_j$$

and

$$U_{4,n_c} = \binom{n_c}{2}^{-1} \sum\sum_{n_1+1 \le j < k \le n_1+n_c} \{ \gamma_j \Psi(C_j - C_k) + \gamma_k \Psi(C_k - C_j) \}$$

are two U-statistics. It follows that

$$\frac{n_c^{1/2}}{\binom{n_c}{2}} \left( T_{n_c} - n_c(n_c+1)/4 \right) = \frac{n_c^{3/2}}{\binom{n_c}{2}} \left( U_{3,n_c} - E(U_{3,n_c}) \right) + n_c^{1/2} \left( U_{4,n_c} - E(U_{4,n_c}) \right).$$

Note that $0 \le U_{3,n_c} \le 1$, so $|U_{3,n_c} - 1/2| \le 1/2$ and
thus

$$\frac{n_c^{3/2}}{\binom{n_c}{2}} \left| U_{3,n_c} - 1/2 \right| \le \frac{n_c^{3/2}}{n_c(n_c - 1)} \to 0 \quad \text{as } n_c \to \infty.$$

Thus, $\frac{n_c^{1/2}}{\binom{n_c}{2}} (T_{n_c} - n_c(n_c+1)/4)$ and
$n_c^{1/2}(U_{4,n_c} - 1/2)$ have the same limiting distribution as
$n_c \to \infty$.

Again applying Theorem 3.3.13 of Randles and Wolfe (1979), it is seen
that $n_c^{1/2}(U_{4,n_c} - 1/2)$ has a limiting normal distribution
with mean 0 and variance $r^2 \zeta_1$ (provided $r^2 \zeta_1 > 0$),
where

$$r^2 \zeta_1 = 2^2 \{ E[(\gamma_j \Psi(C_j - C_k) + \gamma_k \Psi(C_k - C_j)) \times (\gamma_j \Psi(C_j - C_i) + \gamma_i \Psi(C_i - C_j))] - 1/4 \}.$$

By the independence of $\gamma_j$ and $C_j$ (Lemma 3.2.2),

$$r^2 \zeta_1 = 2^2 \{ P(\gamma_j = 1) P(C_j > C_k, C_j > C_i)$$
$$+ P(\gamma_j = 1) P(\gamma_i = 1) P(C_j > C_k, C_i > C_j)$$
$$+ P(\gamma_k = 1) P(\gamma_j = 1) P(C_k > C_j, C_j > C_i)$$
$$+ P(\gamma_k = 1) P(\gamma_i = 1) P(C_k > C_j, C_i > C_j) - 1/4 \}$$
$$= 4 \{ \tfrac{1}{2} P(C_j > C_k, C_j > C_i) + \tfrac{1}{4} P(C_j > C_k, C_i > C_j)$$
$$+ \tfrac{1}{4} P(C_k > C_j, C_j > C_i) + \tfrac{1}{4} P(C_k > C_j, C_i > C_j) - \tfrac{1}{4} \}$$
$$= 4 \{ \tfrac{1}{6} + \tfrac{1}{4} ( \tfrac{1}{6} + \tfrac{1}{6} + \tfrac{1}{3} ) - \tfrac{1}{4} \} = \tfrac{1}{3}.$$

Thus,

$$\frac{T_{n_c} - n_c(n_c+1)/4}{\binom{n_c}{2} \left( \frac{1}{3n_c} \right)^{1/2}} \stackrel{d}{\to} N(0,1).$$

Noting that

$$\frac{\binom{n_c}{2} \left( \frac{1}{3n_c} \right)^{1/2}}{\left[ \frac{n_c(n_c+1)(2n_c+1)}{24} \right]^{1/2}} \to 1 \quad \text{as } n_c \to \infty$$

and applying Slutsky's Theorem, it follows that

$$\frac{T_{n_c} - n_c(n_c+1)/4}{\sigma(T_{n_c})} \stackrel{d}{\to} N(0,1).$$

The conclusion of Theorem 3.4.1 then follows by writing

$$\frac{T_{n_1,n_c} - E(T_{n_1,n_c})}{\sigma(T_{n_1,n_c})} = \frac{(L_1 T_{n_1} + L_2 T_{n_c}) - (L_1 E(T_{n_1}) + L_2 E(T_{n_c}))}{(L_1^2 \sigma^2(T_{n_1}) + L_2^2 \sigma^2(T_{n_c}))^{1/2}}$$
$$= \frac{L_1 \sigma(T_{n_1})}{\sigma(T_{n_1,n_c})} \cdot \frac{T_{n_1} - E(T_{n_1})}{\sigma(T_{n_1})} + \frac{L_2 \sigma(T_{n_c})}{\sigma(T_{n_1,n_c})} \cdot \frac{T_{n_c} - E(T_{n_c})}{\sigma(T_{n_c})},$$

applying Slutsky's Theorem and utilizing the fact that $T_{n_1}$ and
$T_{n_c}$ are independent, conditional on $N_1 = n_1$ and
$N_c = n_c$. □
c 


Next, and most importantly, the unconditional asymptotic normality of $T_{N_1,N_c}$ will be established as $n$ tends to infinity in Theorem 3.4.4. Prior to proving this, several necessary preliminary results will be stated. These preliminary results, which are stated in Lemmas 3.4.2 and 3.4.3, were proved by Popovich (1983) and thus will be stated without proof. Minor notational changes are made in the restatement of his results to accommodate the notation in this dissertation.

The first preliminary result, Lemma 3.4.2, is a generalization of Theorem 1 of Anscombe (1952).


Lemma 3.4.2: Let $\{T_{n_1,n_c}\}$ for $n_1 = 1,2,\ldots$, $n_c = 1,2,\ldots$, be any array of random variables satisfying conditions (i) and (ii).

Condition (i): There exists a real number $\gamma$, an array of positive numbers $\{\omega_{n_1,n_c}\}$ and a distribution function $F(\cdot)$ such that

$$\lim_{\min(n_1,n_c)\to\infty} P\{T_{n_1,n_c} - \gamma \le x\,\omega_{n_1,n_c}\} = F(x)$$

at every continuity point of $F(\cdot)$.

Condition (ii): Given any $\varepsilon > 0$ and $\eta > 0$, there exists $\nu = \nu(\varepsilon,\eta)$ and $d = d(\varepsilon,\eta)$ such that whenever $\min(n_1,n_c) > \nu$, then

$$P\{|T_{n_1',n_c'} - T_{n_1,n_c}| < \varepsilon\,\omega_{n_1,n_c} \text{ for all } n_1', n_c' \text{ such that } |n_1' - n_1| < dn_1,\ |n_c' - n_c| < dn_c\} > 1 - \eta.$$

Let $\{n_r\}$ be an increasing sequence of positive integers tending to infinity and let $\{N_{1r}\}$ and $\{N_{cr}\}$ be random variables taking on positive integer values such that

$$\frac{N_{ir}}{n_r} \xrightarrow{p} \lambda_i \quad \text{as } r \to \infty, \text{ for some } \lambda_i \text{ such that } 0 < \lambda_i < 1,\ i = 1, c.$$

Then at every continuity point $x$ of $F(\cdot)$

$$\lim_{r\to\infty} P\{T_{N_{1r},N_{cr}} - \gamma \le x\,\omega_{[\lambda_1 n_r],[\lambda_c n_r]}\} = F(x),$$

where $[a]$ denotes the greatest integer less than or equal to $a$.

Proof: This is Lemma 3.3.1 in Popovich (1983). □


The last preliminary result necessary is a result of Sproule (1974) which is also stated in Popovich (1983) as Lemma 3.3.3. It can be viewed as the extension of the well-known one-sample U-statistic theorem (Hoeffding, 1948) to the case in which the sample size is a random variable.

Lemma 3.4.3: Suppose that

$$U_n = \binom{n}{r}^{-1}\sum_{\beta\in B} f(x_{\beta_1}, x_{\beta_2}, \ldots, x_{\beta_r}),$$

where $B$ is the set of all subsets of $r$ integers chosen without replacement from the set of integers $\{1,2,\ldots,n\}$ and $f(t_1, t_2, \ldots, t_r)$ is some function symmetric in its $r$ arguments. This $U_n$ is a U-statistic of degree $r$ with a symmetric kernel $f(\cdot)$. Let $\{n_r\}$ be an increasing sequence of positive integers tending to infinity as $r \to \infty$ and $\{N_r\}$ be a sequence of random variables taking on positive integer values with probability one. If $E\{f(X_1, X_2, \ldots, X_r)\}^2 < \infty$, $\lim_{n\to\infty}\mathrm{Var}(n^{1/2}U_n) = r^2\zeta_1 > 0$, and $N_r/n_r \xrightarrow{p} 1$, then

$$\lim_{r\to\infty} P\{(U_{N_r} - E(U_{N_r})) \le N_r^{-1/2}\,x\,(r^2\zeta_1)^{1/2}\} = \Phi(x),$$

where $\Phi(\cdot)$ represents the c.d.f. of a standard normal random variable.

Proof: This is Lemma 3.3.3 in Popovich (1983). □


One comment is needed about this result. The proof of this lemma follows as a result of verifying that conditions $C_1$ and $C_2$ of Anscombe (1952) are valid and applying Theorem 1 of Anscombe (1952). Condition $C_1$ is valid under the null hypothesis and the verification of condition $C_2$ is contained in the proof of Theorem 6 by Sproule (1974). This condition $C_2$ will be utilized in the proof of the major theorem of this section which follows.


Theorem 3.4.4: Under $H_0$,

$$\frac{T_{N_1,N_c} - E(T_{N_1,N_c})}{\sigma(T_{N_1,N_c})} \xrightarrow{d} N(0,1) \quad \text{as } n \to \infty.$$


Proof:

The proof which follows is very similar to the proof of Theorem 3.3.4 in Popovich (1983).

Let $T^*_{n_1,n_c} = \dfrac{T_{n_1,n_c} - E(T_{n_1,n_c})}{\sigma(T_{n_1,n_c})}$, the standardized $T_{n_1,n_c}$ statistic. Theorem 3.4.1 shows that $\{T^*_{n_1,n_c}\}$ for $n_1 = 1,2,\ldots$, $n_c = 1,2,\ldots$, satisfies condition (i) of Lemma 3.4.2 with $\gamma = 0$ and $\omega_{n_1,n_c} = 1$. Note that from assumption A5, it can be seen that $\lambda_i > 0$ for at least one $i = 1, c$. If $\lambda_i = 0$ for $i = 1$ or $i = c$, then Theorem 3.4.4 follows directly from Theorem 1 of Anscombe (1952) and Lemma 3.4.3. Thus, it will be assumed that $\lambda_i > 0$ for $i = 1, c$. The proof of Theorem 3.4.4 follows if it can be shown that condition (ii) of Lemma 3.4.2 is satisfied.

Let $T^*_{n_1} = \dfrac{T_{n_1} - E(T_{n_1})}{\sigma(T_{n_1})}$, the standardized $T_{n_1}$ statistic. In the proof of Theorem 3.4.1, it was shown that $T^*_{n_1}$ has a limiting standard normal distribution by utilizing the U-statistic representation of $T_{n_1}$. As a result of Lemma 3.4.3 and this U-statistic representation, it follows that $T^*_{n_1}$ satisfies condition $C_2$ of Anscombe (1952) (since $T^*_{n_1}$ is equivalent to a U-statistic which satisfies condition $C_2$ of Anscombe (1952) as proved by Sproule (1974)). This condition $C_2$ can be stated as follows.

Condition $C_2$: for a given $\varepsilon_1 > 0$ and $\eta > 0$, there exists $\nu_1$ and $d_1 > 0$ such that for any $n_1 > \nu_1$

$$P\{|T^*_{n_1'} - T^*_{n_1}| < \varepsilon_1 \text{ for all } n_1' \text{ such that } |n_1' - n_1| < d_1n_1\} > 1 - \eta. \qquad (3.4.1)$$


Similarly, as a result of the U-statistic representation of $T_{n_c}$ (as shown in the proof of Theorem 3.4.1) and from Lemma 3.4.3, it follows that $T^*_{n_c} = \dfrac{T_{n_c} - E(T_{n_c})}{\sigma(T_{n_c})}$ satisfies condition $C_2$ of Anscombe (1952). That is, for a given $\varepsilon_2 > 0$ and $\eta > 0$, there exists $\nu_2$ and $d_2 > 0$ such that for any $n_c > \nu_2$

$$P\{|T^*_{n_c'} - T^*_{n_c}| < \varepsilon_2 \text{ for all } n_c' \text{ such that } |n_c' - n_c| < d_2n_c\} > 1 - \eta. \qquad (3.4.2)$$
Consider

$$T^*_{n_1,n_c} = \frac{T_{n_1,n_c} - E(T_{n_1,n_c})}{\sigma(T_{n_1,n_c})} = \frac{L_1\sigma(T_{n_1})}{\sigma(T_{n_1,n_c})}(T^*_{n_1}) + \frac{L_2\sigma(T_{n_c})}{\sigma(T_{n_1,n_c})}(T^*_{n_c}) = L'_{1n}(T^*_{n_1}) + L'_{2n}(T^*_{n_c}).$$

Note that,

(1) $L'_{1n}$ and $L'_{2n}$ are functions only of $N_1$ and $N_c$ and the given $L_1$ and $L_2$ constants.

(2) $(L'_{1n})^2 + (L'_{2n})^2 = 1$.

(3) There exist constants $L_1'$ and $L_2'$ such that $L'_{1n} \xrightarrow{p} L_1'$ and $L'_{2n} \xrightarrow{p} L_2'$ as $n \to \infty$.

First, it will be shown that condition (ii) is satisfied for $T'_{n_1,n_c} = L_1'(T^*_{n_1}) + L_2'(T^*_{n_c})$. Let $\varepsilon > 0$ and $\eta > 0$ be given and let $\nu_1$, $\nu_2$, $d_1$, $d_2$ satisfy (3.4.1) and (3.4.2). Let $\nu = \max(\nu_1, \nu_2)$ and $d = \min(d_1, d_2)$. Now,

$$P\{|T'_{n_1',n_c'} - T'_{n_1,n_c}| < 2\varepsilon \text{ for all } n_1', n_c' \text{ such that } |n_i' - n_i| < dn_i,\ i = 1, c\}$$

$$\ge P\{|L_1'T^*_{n_1'} + L_2'T^*_{n_c'} - L_1'T^*_{n_1} - L_2'T^*_{n_c}| < 2\varepsilon \text{ for all } n_1', n_c' \text{ such that } |n_i' - n_i| < dn_i,\ i = 1, c\}$$

$$\ge P\{L_1'|T^*_{n_1'} - T^*_{n_1}| < \varepsilon \text{ and } L_2'|T^*_{n_c'} - T^*_{n_c}| < \varepsilon \text{ for all } n_1', n_c' \text{ such that } |n_i' - n_i| < dn_i,\ i = 1, c\}$$

$$= P\{L_1'|T^*_{n_1'} - T^*_{n_1}| < \varepsilon \text{ for all } n_1' \text{ such that } |n_1' - n_1| < dn_1\}$$
$$\quad + P\{L_2'|T^*_{n_c'} - T^*_{n_c}| < \varepsilon \text{ for all } n_c' \text{ such that } |n_c' - n_c| < dn_c\}$$
$$\quad - P\{L_1'|T^*_{n_1'} - T^*_{n_1}| < \varepsilon \text{ or } L_2'|T^*_{n_c'} - T^*_{n_c}| < \varepsilon \text{ for all } n_1', n_c' \text{ such that } |n_1' - n_1| < dn_1 \text{ and } |n_c' - n_c| < dn_c\}$$

$$\ge P\{L_1'|T^*_{n_1'} - T^*_{n_1}| < \varepsilon \text{ for all } n_1' \text{ such that } |n_1' - n_1| < dn_1\}$$
$$\quad + P\{L_2'|T^*_{n_c'} - T^*_{n_c}| < \varepsilon \text{ for all } n_c' \text{ such that } |n_c' - n_c| < dn_c\} - 1. \qquad (3.4.3)$$

Now using inequalities (3.4.1) and (3.4.2) and applying them to (3.4.3) with $\varepsilon = \min\{\varepsilon_1(L_1')^{-1/2},\ \varepsilon_2(L_2')^{-1/2}\}$, then

$$P\{|T'_{n_1',n_c'} - T'_{n_1,n_c}| < 2\varepsilon \text{ for all } n_1', n_c' \text{ such that } |n_i' - n_i| < dn_i,\ i = 1, c\} \ge (1 - \eta) + (1 - \eta) - 1 = 1 - 2\eta.$$

Therefore $T'_{n_1,n_c}$ satisfies condition (ii) of Lemma 3.4.2 so that Theorem 3.4.4 is valid for $T'_{n_1,n_c} = L_1'T^*_{n_1} + L_2'T^*_{n_c}$.

To see that the Theorem is valid if $L_1'$ and $L_2'$ are replaced by $L'_{1n}$ and $L'_{2n}$, respectively, consider

$$(L'_{1n}T^*_{n_1} + L'_{2n}T^*_{n_c}) - (L_1'T^*_{n_1} + L_2'T^*_{n_c}) = (L'_{1n} - L_1')T^*_{n_1} + (L'_{2n} - L_2')T^*_{n_c}. \qquad (3.4.4)$$

Now, since $T^*_{n_1}$ and $T^*_{n_c}$ converge in distribution to standard normal random variables, $T^*_{n_1}$ and $T^*_{n_c}$ are $O_p(1)$ (Serfling, 1980, pg. 8). Also, since $L'_{1n} \xrightarrow{p} L_1'$ and $L'_{2n} \xrightarrow{p} L_2'$ as $n \to \infty$, $(L'_{1n} - L_1')$ and $(L'_{2n} - L_2')$ are $o_p(1)$. Therefore (3.4.4) shows that $(L'_{1n} - L_1')T^*_{n_1} + (L'_{2n} - L_2')T^*_{n_c}$ is $o_p(1)$ and thus, Theorem 3.4.4 is valid. □


3.5  Comments


From the results in Sections 3.2, 3.3, and 3.4, it is clear that a distribution-free test of the null hypothesis of bivariate symmetry versus the alternatives presented could be based on $T_{n_1,n_c}$ (or $TM_{n_1,n_c}$). For small samples, an exact test utilizing the distribution of $T_{n_1,n_c}$ (and $TM_{n_1,n_c}$) conditional on $N_1 = n_1$ and $N_c = n_c$ could be performed. For larger samples, the asymptotic normality of $T_{n_1,n_c}$ (and $TM_{n_1,n_c}$) could be used. In Chapter Five, a Monte Carlo study will be presented which compares the CD test with the two tests presented in this chapter. For each, the asymptotic distribution will be used for samples of size 25 and 40 to investigate how the statistics compare under the null and alternative hypotheses for various distributions. First though, we make some comments on this chapter.
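The exact conditional test mentioned above can be sketched in code. The following is a minimal illustration, not the procedure used in the Monte Carlo study; it assumes that, conditional on $N_1 = n_1$ and $N_c = n_c$, the two component statistics are independent and each has the null distribution of a Wilcoxon signed rank statistic, and that the weights `l1` and `l2` (hypothetical argument names) are integers or `Fraction` values so that the support is represented exactly.

```python
from collections import defaultdict
from fractions import Fraction


def signed_rank_pmf(n):
    """Null pmf of the Wilcoxon signed rank statistic for n pairs (no ties)."""
    counts = {0: 1}
    for rank in range(1, n + 1):
        nxt = defaultdict(int)
        for total, ways in counts.items():
            nxt[total] += ways          # this rank carries a negative sign
            nxt[total + rank] += ways   # this rank carries a positive sign
        counts = dict(nxt)
    denom = 2 ** n
    return {t: Fraction(w, denom) for t, w in counts.items()}


def combined_pmf(n1, nc, l1, l2):
    """Null pmf of l1 * T_n1 + l2 * T_nc for two independent components."""
    pmf = defaultdict(Fraction)
    for t1, p1 in signed_rank_pmf(n1).items():
        for tc, pc in signed_rank_pmf(nc).items():
            pmf[l1 * t1 + l2 * tc] += p1 * pc
    return dict(pmf)


def upper_tail(pmf, observed):
    """P(statistic >= observed) under the null pmf."""
    return sum(p for value, p in pmf.items() if value >= observed)


pmf = combined_pmf(3, 2, 1, 1)
print(upper_tail(pmf, 9))
```

With $n_1 = 3$, $n_c = 2$ and unit weights, the largest attainable value is 6 + 3 = 9, which occurs only when every rank in both components is positive.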


Comment  1 


In Section 3.2, the test statistic $T_{n_1,n_c}$, conditional on $n_c$, was presented which had a null distribution equivalent to the Wilcoxon signed rank statistic. If instead of conditioning on $n_c$, the statistic had been presented (with some minor adjustments) conditional on $n_2$ and $n_3$, the statistic would then have had a null distribution equivalent to the Wilcoxon rank sum statistic. Conditioning on $n_c$ and not on $n_2$ and $n_3$ was chosen because the observation of a particular $n_2$ and $n_3$, in itself, seemed important. That is, if only type 3 pairs had occurred (ignoring the number of type 4 pairs), that was significant, since under the null hypothesis, the probability a bivariate pair is type 3 is equal to the probability the pair is type 2. The signed rank statistic incorporates this idea and thus was used.
Comment 2


In Section 3.3, the Kaplan-Meier estimate of the
survival  distribution  was  used  in  estimating  the  common 
location  parameter.   The  usual  median  estimator  (the  sample 
median)  could  not  be  used,  because  in  the  presence  of  right 
censoring  this  estimator  is  negatively  biased.   Thus,  the 
"smoothed"  estimator  based  on  the  Kaplan-Meier  estimate  of 
the  survival  distribution  was  the  logical  choice. 

Comment  3 


The  tests  presented  in  this  chapter  are  not  recommended 
for  situations  in  which  heavy  censoring  occurs  early  on, 
that  is,  a  lot  of  censoring  in  the  smaller  measurements.   If 
this heavy censoring were to occur, many type 4 pairs would be present in the sample which are not used in the calculation of the test statistic other than to estimate the common location parameter. These tests were designed more for situations in which the extreme values (i.e., the larger values) tend to be censored.

Comment  4 


In this chapter, statistics were presented to test for differences in scale when (1) the common location parameter was known or (2) the common location parameter was unknown. The next natural extension would be to test the null hypothesis of bivariate symmetry versus the alternative that differences in scale existed with unknown location parameters which could be potentially different. This idea could be incorporated into the test statistic by using separate "smoothed" estimators for $X_{1i}$ and $X_{2i}$. This idea will be further investigated in Chapter Four.


CHAPTER  FOUR 

A  TEST  FOR  BIVARIATE  SYMMETRY  VERSUS 
LOCATION/SCALE  ALTERNATIVES 


4.1  Introduction


In Chapters Two and Three, test statistics were presented to test the null hypothesis of bivariate symmetry versus the alternative hypothesis that the marginal distributions differed in their scale parameter. This chapter will consider a test for the more general alternative, that is, that the marginal distributions differ in location and/or scale. To do this, two statistics will be made the components of a 2-vector, $W_n$, of test statistics. The first statistic, denoted $TE_{n_1,n_c}$, is a statistic which is used to detect location differences. It was introduced by Popovich (1983) and is somewhat similar to the statistic introduced in Chapter Three. The second component of the 2-vector will be a statistic which is designed to test for scale differences. Three different statistics will be considered for this second component. They are (1) $TM_{n_1,n_c}$ (Chapter Three, Section 3.3), (2) $TM_{n_1,n_c}$ but using separate location estimates for $X_{1i}$ and $X_{2i}$, and (3) the CD statistic (Chapter Two). It will be shown in Sections 4.2 and 4.3 that $W_n$ is not distribution-free, even when $H_0$ is true. Thus, if $\Sigma_n$ is the variance-covariance matrix of $W_n$, the quadratic form $W_n'\Sigma_n^{-1}W_n$ will not be distribution-free. A consistent estimator of $\Sigma_n$, $\hat{\Sigma}_n$, will be introduced in Section 4.5 and a test based on the asymptotically distribution-free statistic $W_n'\hat{\Sigma}_n^{-1}W_n$ will be recommended for large sample sizes. For small sample sizes a permutation test will be recommended. First though, we introduce the TE statistic by Popovich (1983) with a slight change in notation to accommodate this thesis.

Let $D_i = X_{1i} - X_{2i}$ and $R(|D_i|)$ be the absolute rank of $D_i$ for $i = 1,2,\ldots,n_1$, that is, $R(|D_i|)$ is the rank of $|D_i|$ among $(|D_1|, |D_2|, \ldots, |D_{n_1}|)$. Define

$$\Psi_i = \Psi(D_i) = \begin{cases} 1 & \text{if } D_i > 0 \\ 0 & \text{if } D_i \le 0. \end{cases}$$

Let $TE_{n_1}$ and $TE_{n_c}$ be defined to be the following:

$$TE_{n_1} = \sum_{i=1}^{n_1}\Psi_i R(|D_i|)$$

and

$$TE_{n_c} = N_3 - N_2.$$


Notice that $TE_{n_1}$ is the Wilcoxon signed rank statistic applied to the $n_1$ totally uncensored pairs. Popovich (1983) showed under $H_0$, $N_3$ is distributed as a Binomial random variable with parameters $n$ and $p = \tfrac{1}{2}P_2(0) = \tfrac{1}{2}P(\text{type 2 or 3 pair})$. With a slight modification from Popovich, the statistic $TE_{n_1,n_c}$ is

$$TE_{n_1,n_c} = K_{1n}(TE^*_{n_1}) + K_{2n}(TE^*_{n_c}),$$

where

$$TE^*_{n_1} = \frac{TE_{n_1} - n_1(n_1 + 1)/4}{(n_1(n_1 + 1)(2n_1 + 1)/24)^{1/2}}$$

and

$$TE^*_{n_c} = \frac{TE_{n_c}}{(n_c)^{1/2}},$$

and $K_{1n}$ and $K_{2n}$ are sequences of random variables satisfying:

1) $K_{1n}$ and $K_{2n}$ are only functions of $N_1$ and $N_c$,

2) there exist finite constants $K_1$ and $K_2$ such that $K_{1n} \xrightarrow{p} K_1$ and $K_{2n} \xrightarrow{p} K_2$ as $n \to \infty$.

This is slightly different from the statistic Popovich introduced, the difference being that he required $K_{1n} = (1 - K_{2n})$, which is not being required here.
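As an illustration of the definitions above, the location statistic can be computed as follows. This is a sketch under stated assumptions: the uncensored pairs are passed as two equal-length lists with no tied absolute differences, the counts of type 2 and type 3 pairs are passed as `n2` and `n3`, and the weights `k1` and `k2` stand in for realized values of $K_{1n}$ and $K_{2n}$. All function and argument names are hypothetical.

```python
import math


def signed_rank_statistic(d):
    """Sum of the ranks of |d_i| over the pairs with d_i > 0 (assumes no ties)."""
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0] * len(d)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return sum(r for r, value in zip(ranks, d) if value > 0)


def te_statistic(x1, x2, n2, n3, k1, k2):
    """k1 * TE*_{n1} + k2 * TE*_{nc} for fixed weights k1 and k2."""
    n1 = len(x1)
    nc = n2 + n3
    d = [a - b for a, b in zip(x1, x2)]
    te_n1 = signed_rank_statistic(d)
    # Standardize with the null mean and variance of the signed rank statistic.
    te_n1_std = (te_n1 - n1 * (n1 + 1) / 4) / math.sqrt(
        n1 * (n1 + 1) * (2 * n1 + 1) / 24)
    # TE_{nc} = N3 - N2, standardized by the square root of n_c.
    te_nc_std = (n3 - n2) / math.sqrt(nc)
    return k1 * te_n1_std + k2 * te_nc_std


x1 = [5.1, 3.0, 7.2, 4.4]
x2 = [4.0, 3.5, 5.0, 4.3]
print(te_statistic(x1, x2, n2=1, n3=3, k1=0.5, k2=0.5))
```

Type 4 pairs do not enter the computation, in line with the remark that they are ignored in this chapter.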

One comment before proceeding to Section 4.2. In this chapter, type 4 pairs will be ignored (except in estimating the location parameter for the scale statistics). This has no real effect since $TE_{n_1,n_c}$, $TM_{n_1,n_c}$ and CD are not affected by their presence (other than in estimating the location parameter). It will be assumed that the sample is of size $n = N_1 + N_c$.
4.2  The $W_n$ Statistic Using $T_{n_1,n_c}$

The first statistic to be considered for pairing with $TE_{n_1,n_c}$ is similar to the statistic $TM_{n_1,n_c}$ presented in Section 3.3. The difference is that instead of using a common estimate for $\mu$ as in Section 3.3, here we first consider using separate estimates which are denoted by $M_1$ and $M_2$, where $M_1$ is the Kaplan-Meier estimate for $\mu$ based on the $X_{1i}$'s alone and similarly, $M_2$ is the Kaplan-Meier estimate for $\mu$ based on the $X_{2i}$'s. Define

$$T_{n_1} = \sum_{i=1}^{n_1}\Psi_i R\big(\big|\,|X_{2i} - M_2| - |X_{1i} - M_1|\,\big|\big)$$

and

$$T_{n_c} = \sum_{j=n_1+1}^{n_1+n_c}\gamma_j Q_j.$$

(Note these are similar to statistics defined in Section 3.2, with a slight modification of using the separate estimators $M_1$ and $M_2$.) Similarly, define

$$T^*_{n_1} = \frac{T_{n_1} - n_1(n_1 + 1)/4}{(n_1(n_1 + 1)(2n_1 + 1)/24)^{1/2}}$$

and

$$T^*_{n_c} = \frac{T_{n_c} - n_c(n_c + 1)/4}{(n_c(n_c + 1)(2n_c + 1)/24)^{1/2}},$$
which would be the standardized versions of $T_{n_1}$ and $T_{n_c}$ in Section 3.3, had not $M_1$ and $M_2$ been used. We now define

$$W_{1n} = \begin{pmatrix} K_{1n}TE^*_{n_1} + K_{2n}TE^*_{n_c} \\ L_{1n}T^*_{n_1} + L_{2n}T^*_{n_c} \end{pmatrix},$$

where $L_{1n}$ and $L_{2n}$ are sequences of random variables satisfying:

1) $L_{1n}$ and $L_{2n}$ are functions only of $N_1$ and $N_c$,

and

2) there exist finite constants $L_1$ and $L_2$ such that $L_{1n} \xrightarrow{p} L_1$ and $L_{2n} \xrightarrow{p} L_2$ as $n \to \infty$.

Note, the statistic for scale $L_{1n}T^*_{n_1} + L_{2n}T^*_{n_c}$ is slightly different from the forms presented in Chapter Three. The difference is that here, the two components are standardized before taking the linear combination. Appropriate weighting variables can be chosen though, which make this form of the statistic equivalent to that presented in Chapter Three.

The test statistic for the alternative of differences in location and/or scale is $W_{1n}'\Sigma_{1n}^{-1}W_{1n}$, where $\Sigma_{1n}$ is the variance-covariance matrix for $W_{1n}$. The derivation of the asymptotic distribution of $W_{1n}'\Sigma_{1n}^{-1}W_{1n}$ will be accomplished in a series of proofs. Theorem 4.2.1 shows that under $H_0$, if the common location parameter was known and used instead of the estimators $M_1$ and $M_2$, the vector $T = (TE^*_{n_1}, T^*_{n_1}(\mu), TE^*_{n_c}, T^*_{n_c})'$ has a limiting multivariate normal distribution. Here $T^*_{n_1}(\mu)$ denotes the statistic $T^*_{n_1}$ when the value of $\mu$ is used in its calculation. $T^*_{n_1}(\mu)$ is the standardized $T_{n_1}$ which was presented in Section 3.2. Next, Theorem 4.2.3 will prove that using the estimates $M_1$ and $M_2$ for the common location parameter does not affect the asymptotic results in Theorem 4.2.1. This is achieved by applying results about U-statistics with estimated parameters (Randles, 1982) which are stated in Theorem 4.2.2. Finally, the asymptotic distribution of $W_{1n}'\Sigma_{1n}^{-1}W_{1n}$ will be presented in Theorem 4.2.5.

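Theorem 4.2.1: When $\mu$ is known and used in calculating $T^*_{n_1}$, under $H_0$ and conditional on $N_1 = n_1$ and $N_c = n_c$,

$$T = (TE^*_{n_1},\ T^*_{n_1}(\mu),\ TE^*_{n_c},\ T^*_{n_c})' \xrightarrow{d} N(0, \Sigma_T),$$

where $\Sigma_T = ((\sigma_T^{(a,b)}))$ is the 4x4 variance-covariance matrix for $T$ with

$$\sigma_T^{(1,1)} = \sigma_T^{(2,2)} = \sigma_T^{(3,3)} = \sigma_T^{(4,4)} = 1,$$

$$\sigma_T^{(1,2)} = 12P^* \quad \text{(as defined on page 99)},$$

$$\sigma_T^{(3,4)} = -(3/4)^{1/2},$$

$$\sigma_T^{(1,3)} = \sigma_T^{(2,3)} = \sigma_T^{(1,4)} = \sigma_T^{(2,4)} = 0.$$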


Proof:

Recall in the proof of Theorem 3.4.1, it was shown that

$$T^*_{n_1}(\mu) = (n_1)^{1/2}(3)^{1/2}(U_{2,n_1} - 1/2) + o_p(1)$$

and

$$T^*_{n_c} = (n_c)^{1/2}(3)^{1/2}(U_{4,n_c} - 1/2) + o_p(1),$$

where

$$U_{2,n_1} = \binom{n_1}{2}^{-1}\sum\sum_{1\le i<j\le n_1}\Psi(|x_{2i} - \mu| - |x_{1i} - \mu| + |x_{2j} - \mu| - |x_{1j} - \mu|)$$

and

$$U_{4,n_c} = \binom{n_c}{2}^{-1}\sum\sum_{n_1+1\le j<k\le n_1+n_c}\{\gamma_j\Psi(c_j - c_k) + \gamma_k\Psi(c_k - c_j)\}.$$

Similarly, Popovich's statistics can be written as

$$TE^*_{n_1} = (n_1)^{1/2}(3)^{1/2}(U_{1,n_1} - 1/2) + o_p(1)$$

and

$$TE^*_{n_c} = (n_c)^{1/2}\,U_{3,n_c},$$

where

$$U_{1,n_1} = \binom{n_1}{2}^{-1}\sum\sum_{1\le i<j\le n_1}\Psi(x_{1i} - x_{2i} + x_{1j} - x_{2j})$$

and

$$U_{3,n_c} = \frac{1}{n_c}\sum_{j=n_1+1}^{n_1+n_c}(1 - 2\gamma_j).$$
Thus,

$$T = \begin{pmatrix} TE^*_{n_1} \\ T^*_{n_1}(\mu) \\ TE^*_{n_c} \\ T^*_{n_c} \end{pmatrix} = \begin{pmatrix} (n_1)^{1/2}(3)^{1/2}(U_{1,n_1} - 1/2) \\ (n_1)^{1/2}(3)^{1/2}(U_{2,n_1} - 1/2) \\ (n_c)^{1/2}(U_{3,n_c}) \\ (n_c)^{1/2}(3)^{1/2}(U_{4,n_c} - 1/2) \end{pmatrix} + o_p(1)$$

and therefore, if we can show the right hand side has the appropriate distribution, the proof will be complete.

First, it will be shown that

$$U = \begin{pmatrix} n^{1/2}(U_{1,n_1} - 1/2) \\ n^{1/2}(U_{2,n_1} - 1/2) \\ n^{1/2}(U_{3,n_c}) \\ n^{1/2}(U_{4,n_c} - 1/2) \end{pmatrix} \xrightarrow{d} N(0, \Sigma_U),$$

where $\Sigma_U = ((\sigma^{(a,b)}))$ and

$$\sigma^{(a,b)} = \sum_{i=1}^{2}\frac{r_i^{(a)}r_i^{(b)}}{\lambda_i}\,\zeta_i^{(a,b)},$$

$\lambda_i = \lim_{n\to\infty}(n_i/n)$, $(r_1^{(a)}, r_2^{(a)})$ are the degrees for U-statistic $U_a$, and $\zeta_i^{(a,b)}$ is the covariance term described in Theorem 3.6.9 of Randles and Wolfe (1979, pg. 107).

Note, conditional on $N_1 = n_1$ and $N_c = n_c$, the problem can be considered as a two sample problem, so the stated limit follows from Theorem 3.6.9 (Randles and Wolfe, 1979, pg. 107). Here, $U_{1,n_1}$ and $U_{2,n_1}$ are of degree (2,0), $U_{3,n_c}$ is of degree (0,1) and $U_{4,n_c}$ is of degree (0,2). We now evaluate the matrix $\Sigma_U$.
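Now,

$$\sigma^{(2,2)} = \frac{2\times 2}{\lambda_1}\zeta_1^{(2,2)} + \frac{0\times 0}{\lambda_2}\zeta_2^{(2,2)} = \frac{2\times 2}{\lambda_1}\zeta_1^{(2,2)},$$

where

$$\zeta_1^{(2,2)} = \mathrm{Cov}\{\Psi(|X_{2i} - \mu| - |X_{1i} - \mu| + |X_{2j} - \mu| - |X_{1j} - \mu|),\ \Psi(|X_{2i} - \mu| - |X_{1i} - \mu| + |X_{2k} - \mu| - |X_{1k} - \mu|)\}$$
$$= E\{\Psi(|X_{2i} - \mu| - |X_{1i} - \mu| + |X_{2j} - \mu| - |X_{1j} - \mu|) \times \Psi(|X_{2i} - \mu| - |X_{1i} - \mu| + |X_{2k} - \mu| - |X_{1k} - \mu|)\} - 1/4.$$

Notice that under $H_0$, $|X_{2i} - \mu| - |X_{1i} - \mu|$ is symmetrically distributed about 0. Thus,

$$\zeta_1^{(2,2)} = 1/3 - 1/4 = 1/12$$

and

$$\sigma^{(2,2)} = 1/(3\lambda_1).$$

Similarly,

$$\sigma^{(1,1)} = \frac{2\times 2}{\lambda_1}\zeta_1^{(1,1)},$$

where

$$\zeta_1^{(1,1)} = \mathrm{Cov}\{\Psi(X_{1i} - X_{2i} + X_{1j} - X_{2j}),\ \Psi(X_{1i} - X_{2i} + X_{1k} - X_{2k})\} = 1/12,$$

so that

$$\sigma^{(1,1)} = 1/(3\lambda_1).$$

Likewise,

$$\sigma^{(3,3)} = (1/\lambda_2)\zeta_2^{(3,3)} = (1/\lambda_2)\mathrm{Cov}\{1 - 2\gamma_i,\ 1 - 2\gamma_i\} = (1/\lambda_2)\,4\,\mathrm{Var}(\gamma_i) = 1/\lambda_2,$$

$$\sigma^{(4,4)} = (4/\lambda_2)\mathrm{Cov}\{\gamma_i\Psi(c_i - c_j) + \gamma_j\Psi(c_j - c_i),\ \gamma_i\Psi(c_i - c_k) + \gamma_k\Psi(c_k - c_i)\} = (4/\lambda_2)(1/12) = 1/(3\lambda_2),$$

$$\sigma^{(1,2)} = \frac{2\times 2}{\lambda_1}\zeta_1^{(1,2)} + \frac{0\times 0}{\lambda_2}\zeta_2^{(1,2)}$$
$$= (4/\lambda_1)\mathrm{Cov}\{\Psi(|X_{21} - \mu| - |X_{11} - \mu| + |X_{22} - \mu| - |X_{12} - \mu|),\ \Psi(X_{11} - X_{21} + X_{13} - X_{23})\}$$
$$= (4/\lambda_1)\{\Pr[(|X_{21} - \mu| - |X_{11} - \mu| + |X_{22} - \mu| - |X_{12} - \mu|) > 0,\ (X_{11} - X_{21} + X_{13} - X_{23}) > 0] - 1/4\}$$
$$\equiv 4P^*/\lambda_1,$$

$$\sigma^{(3,4)} = (2/\lambda_2)\zeta_2^{(3,4)}$$
$$= (2/\lambda_2)\mathrm{Cov}\{\gamma_i\Psi(c_i - c_k) + \gamma_k\Psi(c_k - c_i),\ 1 - 2\gamma_i\}$$
$$= -(4/\lambda_2)\{E[\gamma_i^2\Psi(c_i - c_k) + \gamma_i\gamma_k\Psi(c_k - c_i)] - 1/4\}$$
$$= -(4/\lambda_2)\{E[\gamma_i\Psi(c_i - c_k)] + E[\gamma_i\gamma_k\Psi(c_k - c_i)] - 1/4\}$$
$$= -(4/\lambda_2)(1/4 + 1/2 \times 1/2 \times 1/2 - 1/4) \quad \text{(by Lemma 4.2.1)}$$
$$= -1/(2\lambda_2),$$

and

$$\sigma^{(1,3)} = \sigma^{(2,3)} = \sigma^{(1,4)} = \sigma^{(2,4)} = 0.$$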
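Thus, we have

$$U = \begin{pmatrix} n^{1/2}(U_{1,n_1} - 1/2) \\ n^{1/2}(U_{2,n_1} - 1/2) \\ n^{1/2}(U_{3,n_c}) \\ n^{1/2}(U_{4,n_c} - 1/2) \end{pmatrix} \xrightarrow{d} N(0, \Sigma_U),$$

where

$$\Sigma_U = \begin{pmatrix} 1/(3\lambda_1) & (4/\lambda_1)P^* & 0 & 0 \\ (4/\lambda_1)P^* & 1/(3\lambda_1) & 0 & 0 \\ 0 & 0 & 1/\lambda_2 & -1/(2\lambda_2) \\ 0 & 0 & -1/(2\lambda_2) & 1/(3\lambda_2) \end{pmatrix}.$$

Define a 4x4 matrix $A$ to be

$$A = \begin{pmatrix} (3\lambda_1)^{1/2} & 0 & 0 & 0 \\ 0 & (3\lambda_1)^{1/2} & 0 & 0 \\ 0 & 0 & (\lambda_2)^{1/2} & 0 \\ 0 & 0 & 0 & (3\lambda_2)^{1/2} \end{pmatrix}$$

and applying Corollary 1.7.1 in Serfling (1980, pg. 25), it follows that

$$AU = \begin{pmatrix} (n\lambda_1)^{1/2}(3)^{1/2}(U_{1,n_1} - 1/2) \\ (n\lambda_1)^{1/2}(3)^{1/2}(U_{2,n_1} - 1/2) \\ (n\lambda_2)^{1/2}(U_{3,n_c}) \\ (n\lambda_2)^{1/2}(3)^{1/2}(U_{4,n_c} - 1/2) \end{pmatrix} \xrightarrow{d} N(0, \Sigma_T),$$

where

$$\Sigma_T = A\Sigma_UA' = \begin{pmatrix} 1 & 12P^* & 0 & 0 \\ 12P^* & 1 & 0 & 0 \\ 0 & 0 & 1 & -(3/4)^{1/2} \\ 0 & 0 & -(3/4)^{1/2} & 1 \end{pmatrix}.$$

Recalling that $\lambda_i = \lim_{n\to\infty}(n_i/n)$, the proof of Theorem 4.2.1 follows. □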
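The matrix product $A\Sigma_UA'$ in the proof is easy to verify numerically. The sketch below assumes the entries of $\Sigma_U$ and the diagonal of $A$ as given in the proof of Theorem 4.2.1; the numerical values chosen for $\lambda_1$, $\lambda_2$ and $P^*$ are arbitrary placeholders, not values from this dissertation.

```python
import math


def sigma_u(lam1, lam2, p_star):
    """Covariance matrix of the scaled U-statistic vector."""
    return [
        [1 / (3 * lam1), 4 * p_star / lam1, 0.0, 0.0],
        [4 * p_star / lam1, 1 / (3 * lam1), 0.0, 0.0],
        [0.0, 0.0, 1 / lam2, -1 / (2 * lam2)],
        [0.0, 0.0, -1 / (2 * lam2), 1 / (3 * lam2)],
    ]


def scale_by_diagonal(diag, matrix):
    """Return A * matrix * A' for the diagonal matrix A = diag(diag)."""
    size = len(diag)
    return [[diag[a] * matrix[a][b] * diag[b] for b in range(size)]
            for a in range(size)]


def sigma_t(lam1, lam2, p_star):
    """A * Sigma_U * A' with A = diag((3 lam1)^1/2, (3 lam1)^1/2, lam2^1/2, (3 lam2)^1/2)."""
    diag = [math.sqrt(3 * lam1), math.sqrt(3 * lam1),
            math.sqrt(lam2), math.sqrt(3 * lam2)]
    return scale_by_diagonal(diag, sigma_u(lam1, lam2, p_star))


for row in sigma_t(0.6, 0.4, 0.02):
    print([round(entry, 6) for entry in row])
```

The result has unit diagonal, off-diagonal entries $12P^*$ and $-(3/4)^{1/2}$, and does not depend on $\lambda_1$ or $\lambda_2$, in agreement with the statement of the theorem.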


The next step in proving the asymptotic distribution of $W_{1n}'\Sigma_{1n}^{-1}W_{1n}$ will be to show that the estimators $M_1$ and $M_2$ do not affect the asymptotic distribution of $T$. Theorem 4.2.2 states results about U-statistics with estimated parameters (Randles, 1982). This theorem condenses Conditions 2.2, 2.3, 2.9A, Lemma 2.6 and Theorem 2.13 from Randles (1982) into the statement of Theorem 4.2.2.


Theorem 4.2.2: Given the following three conditions

1) Assume there exists a $B_1 > 0$ such that $|h(x_1,\ldots,x_r;\gamma) - h(x_1,\ldots,x_r;\mu)| \le B_1$ for every $x_1,\ldots,x_r$ and all $\gamma$ in some neighborhood of $\mu$, where $h(\cdot\,;\gamma)$ denotes the kernel of the U-statistic $U_n(\gamma)$.

2) Suppose there is a neighborhood of $\mu$, call it $K(\mu)$, and a constant $B_2 > 0$ such that if $\gamma \in K(\mu)$ and $D(\gamma,d)$ is a sphere centered at $\gamma$ with radius $d$ satisfying $D(\gamma,d) \subset K(\mu)$, then

$$E\Big[\sup_{\gamma'\in D(\gamma,d)}|h(X_1,\ldots,X_r;\gamma') - h(X_1,\ldots,X_r;\gamma)|\Big] \le B_2d$$

(Condition 2.3)

and

3) Assume $E[h(X_1,\ldots,X_r;\gamma)]$ has a zero differential at $\gamma = \mu$, that

$$n^{1/2}[\hat{\mu} - \mu] = O_p(1)$$

and

$$n^{1/2}[U_n(\mu) - E(U_n(\mu))] \xrightarrow{d} N(0, \sigma^2),$$

where

$$\sigma^2 = \mathrm{Var}\{E[h(X_1,\ldots,X_r;\mu)\,|\,X_1]\} > 0$$

(Condition 2.9A),

then

$$n^{1/2}[U_n(\hat{\mu}) - U_n(\mu)] \xrightarrow{p} 0.$$

Proof: See Randles (1982). □
Theorem 4.2.3: Under $H_0$,

$$(n_1)^{1/2}[T^*_{n_1}(\mu) - T^*_{n_1}(M_1, M_2)] \xrightarrow{p} 0,$$

where $T^*_{n_1}(\mu)$ is the statistic $T^*_{n_1}$ which uses $\mu$, while $T^*_{n_1}(M_1, M_2)$ is the statistic which uses the estimates of $\mu$, $M_1$ and $M_2$.

Proof:

The proof of Theorem 4.2.3 follows if the conditions of Theorem 4.2.2 hold. Although Theorem 4.2.2 has been stated here in terms of one parameter, the theorem is valid for a p-vector parameter (i.e., $\mu$). Thus, the unknown parameter in this case is $(\mu_1, \mu_2)$, which is being estimated by $(M_1, M_2)$. Next we need to show that the necessary conditions hold.

Note, Condition 1 follows directly from the fact that the kernel for $T^*_{n_1}$ is an indicator function, that is

$$\Psi(t) = \begin{cases} 1 & t > 0 \\ 0 & t \le 0 \end{cases}$$

and thus

$$|\Psi(|X_{2i} - \gamma_2| - |X_{1i} - \gamma_1| + |X_{2j} - \gamma_2| - |X_{1j} - \gamma_1|) - \Psi(|X_{2i} - \mu| - |X_{1i} - \mu| + |X_{2j} - \mu| - |X_{1j} - \mu|)| \le 1.$$
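To prove Condition 2 holds, it needs to be shown that

$$E\Big[\sup_{\gamma'\in D(\gamma,d)}|\Psi(|X_{2i} - \gamma_2'| - |X_{1i} - \gamma_1'| + |X_{2j} - \gamma_2'| - |X_{1j} - \gamma_1'|) - \Psi(|X_{2i} - \gamma_2| - |X_{1i} - \gamma_1| + |X_{2j} - \gamma_2| - |X_{1j} - \gamma_1|)|\Big] \le B_2d. \qquad (4.2.1)$$

First, consider the following change of variables: let $Y_{1i} = X_{1i} - \gamma_1$ and $Y_{2i} = X_{2i} - \gamma_2$. This simplifies (4.2.1) to showing

$$E\Big[\sup_{\gamma'\in D(\gamma,d)}|\Psi(|Y_{2i}| + |Y_{2j}| - |Y_{1i}| - |Y_{1j}|) - \Psi(|Y_{2i} - (\gamma_2' - \gamma_2)| + |Y_{2j} - (\gamma_2' - \gamma_2)| - |Y_{1i} - (\gamma_1' - \gamma_1)| - |Y_{1j} - (\gamma_1' - \gamma_1)|)|\Big] \le B_2d. \qquad (4.2.2)$$

Recalling that $|\Psi(t) - \Psi(t')| \le 1$, we need to show

$$P\Big[\sup_{\gamma'\in D(\gamma,d)}|\Psi(|Y_{2i}| + |Y_{2j}| - |Y_{1i}| - |Y_{1j}|) - \Psi(|Y_{2i} - (\gamma_2' - \gamma_2)| + |Y_{2j} - (\gamma_2' - \gamma_2)| - |Y_{1i} - (\gamma_1' - \gamma_1)| - |Y_{1j} - (\gamma_1' - \gamma_1)|)| = 1\Big] \le B_2d. \qquad (4.2.3)$$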
To prove (4.2.3) we will utilize the fact that $|\gamma_2' - \gamma_2| \le d$ and $|\gamma_1' - \gamma_1| \le d$ (i.e., $\gamma' \in D(\gamma,d)$) and first consider the region where $|Y_{1k}| > d$ and $|Y_{2k}| > d$ for $k=i,j$. Without loss of generality, it will be assumed that $\gamma_2' > \gamma_2$ and $\gamma_1' > \gamma_1$. It will be argued that this region can be appropriately bounded, and similarly that the regions which have not been included here can be bounded also.

For the first region we are considering (i.e., $|Y_{1k}| > d$ and $|Y_{2k}| > d$, $k=i,j$) notice that it can be divided into 16 subregions determined by $|Y_{1k} - (\gamma_1'-\gamma_1)|$ and $|Y_{2k} - (\gamma_2'-\gamma_2)|$, $k=i,j$; that is, determined by whether $Y_{1k}$ is $> d$ or $< -d$ for $k=i,j$ and whether $Y_{2k}$ is $> d$ or $< -d$ for $k=i,j$. Consider the subregion where $Y_{1i} > d$, $Y_{2i} > d$, $Y_{1j} > d$ and $Y_{2j} > d$. In this region (4.2.3) simplifies to


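$$P\Bigl[\bigl|\Psi(Y_{2i} + Y_{2j} - Y_{1i} - Y_{1j}) - \Psi(Y_{2i} + Y_{2j} - Y_{1i} - Y_{1j} - 2(\gamma_2'-\gamma_2) + 2(\gamma_1'-\gamma_1))\bigr| = 1\Bigr]. \qquad (4.2.4)$$

Letting $Y^* = Y_{2i} + Y_{2j} - Y_{1i} - Y_{1j}$, (4.2.4) becomes

$$P\Bigl[\bigl|\Psi(Y^*) - \Psi(Y^* - 2(\gamma_2'-\gamma_2) + 2(\gamma_1'-\gamma_1))\bigr| = 1\Bigr],$$

which is equal to

$$\int_0^{2(\gamma_2'-\gamma_2) - 2(\gamma_1'-\gamma_1)} f(y^*)\,dy^* \quad \text{if } 2(\gamma_2'-\gamma_2) - 2(\gamma_1'-\gamma_1) > 0,$$

$$\int_{2(\gamma_2'-\gamma_2) - 2(\gamma_1'-\gamma_1)}^{0} f(y^*)\,dy^* \quad \text{if } 2(\gamma_2'-\gamma_2) - 2(\gamma_1'-\gamma_1) < 0,$$

where $f(y^*)$ denotes the density function of $Y^*$. Now $f(y^*)$ is bounded if $X_{1i}$ and $X_{2i}$ have a bounded joint density. Letting this bound be denoted by $B$ (finite), then

$$\int_0^{2(\gamma_2'-\gamma_2) - 2(\gamma_1'-\gamma_1)} f(y^*)\,dy^* \le B \int_0^{2(\gamma_2'-\gamma_2) - 2(\gamma_1'-\gamma_1)} dy^* \le 4Bd$$

and similarly,

$$\int_{2(\gamma_2'-\gamma_2) - 2(\gamma_1'-\gamma_1)}^{0} f(y^*)\,dy^* \le 4Bd.$$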

Thus,  this  subregion  is  bounded.   With  similar  arguments, 
the  remaining  15  subregions  can  be  shown  to  be  bounded  with 
the  same  type  of  expression. 

Similarly, the other regions (i.e., $\{|Y_{1k}| \le d$ and $|Y_{2k}| \le d,\ k=i,j\}$, $\{|Y_{1k}| > d$ and $|Y_{2k}| \le d,\ k=i,j\}$, etc.) can also be bounded by $K_1 d$ for some constant $K_1$. This completes the proof of Condition 2.
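The bound used for the subregion above can be checked numerically in a simple special case. The sketch below is only an illustration under an assumption that is not part of the proof: $Y^*$ is taken to be $N(0, 2^2)$ (as it would be if the four shifted variables were independent standard normals), so that the density bound is $B = 1/(2\sqrt{2\pi})$. The function names are hypothetical.

```python
import math


def normal_cdf(x, sd):
    """C.d.f. of a N(0, sd^2) random variable (assumed distribution of Y*)."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def flip_probability(c, sd):
    """P[|Psi(Y*) - Psi(Y* - c)| = 1]: Y* falls between 0 and the shift c."""
    lo, hi = min(0.0, c), max(0.0, c)
    return normal_cdf(hi, sd) - normal_cdf(lo, sd)


def density_bound(sd):
    """B: the maximum of the N(0, sd^2) density, attained at zero."""
    return 1.0 / (sd * math.sqrt(2.0 * math.pi))


def check_bound(d, shift2, shift1, sd=2.0):
    """Return (probability, 4*B*d) for shifts gamma_k' - gamma_k with |shift| <= d."""
    if abs(shift2) > d or abs(shift1) > d:
        raise ValueError("shifts must lie in D(gamma, d)")
    c = 2.0 * shift2 - 2.0 * shift1  # total shift of Y*, at most 4d in size
    return flip_probability(c, sd), 4.0 * density_bound(sd) * d
```

For any admissible pair of shifts the first returned value should not exceed the second, which is the inequality $P[\cdot] \le 4Bd$ in this special case.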

For  Condition  3,  under  the  following  conditions, 


J  |     (       f  (u-s ,u)f  (v+s , v)  ds  du  dv 

0    0    -  v    X  X 
and


0    0   -v 
=  J    f       J     f  (u-s  ,u)f  (v  +  s , v)  ds  du  dv 


oo  0  v 

J  f  f      f  ( v-s , v ) f  (-u+s , u )  ds  du  dv 

0  -oo  U 

co  0  V 

=  /  J  f      f  (u-s  ,s)f  (-v  +  s  ,  v)  ds  du  dv  , 

0  -oo  u 


where $f_X(\cdot,\cdot)$ represents the density of $(X_{1i}, X_{2i})$,

(4.2.5)

it can be shown that the differential is zero. For the next requirement of Condition 3, under certain regularity conditions, it can be shown that

$$n^{1/2}[M_i - \mu_i] = O_p(1) \quad \text{for } i=1,2.$$

The regularity conditions which are required are the following:

a) that $F_{X_i}(\cdot)$ is continuous, where $F_{X_i}$ is the marginal c.d.f. for $X_i$, $i=1,2$,

b) that $G(\cdot)$ is continuous,

and

c) $G(F_{X_i}^{-1}(1/2)) < 1$.

Note, conditions a and b are satisfied by assumptions A2 and A3, and condition c requires the censoring distribution to have support which includes the location parameters $\mu_1$ and $\mu_2$, which is satisfied by assumption A6. (See Sanders (1975).) Thus Condition 3 holds since

$$n^{1/2}[U_n(\mu) - E(U_n(\mu))] \xrightarrow{d} N(0,\sigma^2)$$

was shown in Section 3.2, and therefore the proof of Theorem 4.2.3 is complete. □


Corollary 4.2.4: Under $H_0$,

$$n^{1/2}[T^*_{n_1}(\mu) - TM^*_{n_1}] \xrightarrow{p} 0,$$

where $TM^*_{n_1}$ denotes the $T^*_{n_1}$ statistic which uses the combined sample estimate for $\mu$.

Proof:

This can be viewed as a special case of Theorem 4.2.3. Note, in this case it is easily shown that $E[h(X_1, \ldots, X_r; \gamma)]$ has a zero differential by noting that $|X_{2i}-\gamma| - |X_{1i}-\gamma|$ has a symmetric distribution about 0 for any $\gamma$. The extra conditions stated in (4.2.5) are not needed. □


It has been shown in Theorem 4.2.3 (or Corollary 4.2.4) that using the estimates $M_1$ and $M_2$ (or $M$) does not affect the limiting multivariate normal distribution of the vector $T$ of test statistics. The last major theorem of this section, Theorem 4.2.5, states the resulting asymptotic distribution of the quadratic form $W_{1n}' \Sigma_1^{-1} W_{1n}$. In this theorem $T^*_{n_1}(M_1,M_2)$ will denote the $T^*_{n_1}$ statistic using the separate estimators $M_1$ and $M_2$.


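Theorem 4.2.5: Under $H_0$, the following are true:

(1) $$W_{1n} = \begin{pmatrix} K_{1n} TE^*_{n_1} + K_{2n} TE^*_{n_c} \\ L_{1n} T^*_{n_1}(M_1,M_2) + L_{2n} T^*_{n_c} \end{pmatrix} \xrightarrow{d} N(0, \Sigma_1),$$

where $\Sigma_1 = (\sigma_1^{(a,b)})$ with

$$\sigma_1^{(1,1)} = K_1^2 + K_2^2,$$
$$\sigma_1^{(2,2)} = L_1^2 + L_2^2$$

and

$$\sigma_1^{(1,2)} = 12 K_1 L_1 P^* - K_2 L_2 (3/4)^{1/2},$$

(2) $W_{1n}' \Sigma_1^{-1} W_{1n} \xrightarrow{d} \chi^2_{(2)}$,

(3) if $\hat{\Sigma}_1$ is any consistent estimator of $\Sigma_1$, then

$$W_{1n}' \hat{\Sigma}_1^{-1} W_{1n} \xrightarrow{d} \chi^2_{(2)}.$$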


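Proof:

To prove part (1), note that from Theorem 4.2.3 we have

$$T = \begin{pmatrix} TE^*_{n_1} \\ T^*_{n_1}(M_1,M_2) \\ TE^*_{n_c} \\ T^*_{n_c} \end{pmatrix} \xrightarrow{d} N(0, \Sigma_T),$$

where $\Sigma_T$ was defined in Theorem 4.2.1. Defining a matrix $A$ to be

$$A = \begin{pmatrix} K_1 & 0 & K_2 & 0 \\ 0 & L_1 & 0 & L_2 \end{pmatrix}$$

and applying Theorem A in Serfling (1980, pg. 122) we see that

$$AT \xrightarrow{d} N(0, A \Sigma_T A'),$$

where

$$AT = \begin{pmatrix} K_1 TE^*_{n_1} + K_2 TE^*_{n_c} \\ L_1 T^*_{n_1}(M_1,M_2) + L_2 T^*_{n_c} \end{pmatrix} \quad \text{and} \quad \Sigma_1 = A \Sigma_T A'.$$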
Thus, part (1) follows by noting that

$$K_{1n} TE^*_{n_1} + K_{2n} TE^*_{n_c} - (K_1 TE^*_{n_1} + K_2 TE^*_{n_c}) = (K_{1n} - K_1) TE^*_{n_1} + (K_{2n} - K_2) TE^*_{n_c}$$

and since $TE^*_{n_1}$ and $TE^*_{n_c}$ converge in distribution to standard normal random variables, $TE^*_{n_1}$ and $TE^*_{n_c}$ are $O_p(1)$ (Serfling (1980), pg. 8). Also, since $K_{1n} \xrightarrow{p} K_1$ and $K_{2n} \xrightarrow{p} K_2$ as $n \to \infty$, $(K_{1n} - K_1)$ and $(K_{2n} - K_2)$ are $o_p(1)$. Therefore, $(K_{1n} - K_1) TE^*_{n_1} + (K_{2n} - K_2) TE^*_{n_c}$ is $o_p(1)$. A similar argument holds for $L_{1n} T^*_{n_1}(M_1,M_2) + L_{2n} T^*_{n_c}$, and thus the vector $W_{1n}$ has the same limiting distribution as $AT$.

Parts 2 and 3 follow directly from (1) and well known results. □
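As a numerical companion to Theorem 4.2.5, the following sketch forms $A \Sigma_T A'$ and the upper tail of the limiting $\chi^2_{(2)}$ distribution. Since Theorem 4.2.1 is not reproduced in this section, the matrix $\Sigma_T$ used here is an assumption inferred from the entries of $\Sigma_1$ stated in the theorem: unit variances, covariance $12P^*$ between $TE^*_{n_1}$ and $T^*_{n_1}$, covariance $-(3/4)^{1/2}$ between $TE^*_{n_c}$ and $T^*_{n_c}$, and zero covariances between the $n_1$ and $n_c$ components. All function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def sigma_t(p_star):
    """Assumed covariance of (TE*_n1, T*_n1, TE*_nc, T*_nc); see lead-in."""
    c1 = 12.0 * p_star
    c2 = -math.sqrt(3.0 / 4.0)
    return [[1.0, c1, 0.0, 0.0],
            [c1, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, c2],
            [0.0, 0.0, c2, 1.0]]


def sigma_1(k1, k2, l1, l2, p_star):
    """Sigma_1 = A Sigma_T A' with A = [[K1, 0, K2, 0], [0, L1, 0, L2]]."""
    a = [[k1, 0.0, k2, 0.0], [0.0, l1, 0.0, l2]]
    return mat_mul(mat_mul(a, sigma_t(p_star)), transpose(a))


def quadratic_form(w, s):
    """w' s^{-1} w for a 2-vector w and a nonsingular 2x2 matrix s."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    return sum(w[i] * inv[i][j] * w[j] for i in range(2) for j in range(2))


def chi2_2df_upper_tail(q):
    """P[chi-square with 2 d.f. > q], which equals exp(-q/2)."""
    return math.exp(-q / 2.0)
```

With an observed vector `w` and an estimate `s` of $\Sigma_1$, the large sample p-value would be `chi2_2df_upper_tail(quadratic_form(w, s))`.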


The results in Theorem 4.2.5 also hold for

$$W_{2n} = \begin{pmatrix} K_{1n} TE^*_{n_1} + K_{2n} TE^*_{n_c} \\ L_{1n} T^*_{n_1}(M) + L_{2n} T^*_{n_c} \end{pmatrix},$$

where $T^*_{n_1}(M)$ denotes the $T^*_{n_1}$ statistic using the combined location estimate, and thus will not be stated separately. The quadratic form based on $W_{2n}$ is denoted $W_{2n}' \Sigma_2^{-1} W_{2n}$, where $\Sigma_2 = \Sigma_1$. Note that each quadratic form mentioned in this chapter is not distribution-free, although each is asymptotically distribution-free. Section 4.5 will investigate consistent estimates for $\Sigma_1$.


4.3  The $W_{3n}$ Statistic Using CD

The last statistic to be considered for pairing with $TE_{n_1,n_c}$ is the CD statistic presented in Chapter Two. Here, the $W_n$ statistic will be denoted by $W_{3n}$, to indicate this third type of scale statistic used.


Theorem 4.3.1: Conditional on $N_1 = n_1$ and $N_c = n_c$, and under $H_0$,

$$W_{3n} = \begin{pmatrix} K_{1n} TE^*_{n_1} + K_{2n} TE^*_{n_c} \\ CD^* \end{pmatrix} \xrightarrow{d} N(0, \Sigma_3),$$

where $\Sigma_3$ is the variance-covariance matrix for $W_{3n}$ and $CD^*$ is the standardized CD statistic of Chapter Two.


Proof:

Recall from Chapter Two (ignoring type 4 pairs) that

$$CD = \binom{n_1+n_c}{2}^{-1} \sum\sum_{i<j} a_{ij} b_{ij}.$$

If you have only type 1, 2, or 3 pairs, then there are three possibilities for the ith and jth pair types, which are

1) the ith and jth pairs are both type 1's (i.e., uncensored, $\delta_i = \delta_j = 1$),

2) the ith and jth pairs are both type 2's or 3's (i.e., $(\delta_i,\delta_j) \in (2,3)$, where this indicates that $\delta_i$ and $\delta_j$ are both elements of (2,3)),

3) the ith pair is type 1 and the jth pair is type 2 or 3, or vice versa (i.e., $\{\delta_i = 1$ and $\delta_j \in (2,3)\}$ or $\{\delta_i \in (2,3)$ and $\delta_j = 1\}$).
Here, it will be assumed, as in Chapter Three, that the type 1 pairs occupy positions $1,2,\ldots,n_1$ in the sample while the type 2 or 3 pairs occupy positions $n_1+1, n_1+2, \ldots, n_1+n_c$. Thus, CD can be written as

$$CD = \binom{n_1+n_c}{2}^{-1} \Bigl[\sum\sum_{1 \le i<j \le n_1} a_{ij} b_{ij} + \sum\sum_{n_1+1 \le i<j \le n_1+n_c} a_{ij} b_{ij} + \sum_{i=1}^{n_1} \sum_{j=n_1+1}^{n_1+n_c} a_{ij} b_{ij}\Bigr],$$

which is in the form of three U-statistics, that is,

$$CD = \binom{n_1+n_c}{2}^{-1} \Bigl[\binom{n_1}{2} U_{1c} + \binom{n_c}{2} U_{2c} + n_1 n_c U_{3c}\Bigr],$$

where

$$U_{1c} = \binom{n_1}{2}^{-1} \sum\sum_{1 \le i<j \le n_1} a_{ij} b_{ij},$$

$$U_{2c} = \binom{n_c}{2}^{-1} \sum\sum_{n_1+1 \le i<j \le n_1+n_c} a_{ij} b_{ij}$$

and

$$U_{3c} = \frac{1}{n_1 n_c} \sum_{i=1}^{n_1} \sum_{j=n_1+1}^{n_1+n_c} a_{ij} b_{ij}.$$

Note that $U_{1c}$ and $U_{2c}$ are one-sample U-statistics while $U_{3c}$ is a two-sample U-statistic.
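The decomposition of CD into three U-statistics is pure bookkeeping, and it can be confirmed on any array of scores. In the sketch below the scores $a_{ij} b_{ij}$ are replaced by arbitrary made-up numbers (the helper `example_scores` is hypothetical and has no statistical meaning); the point is only that the direct average over all pairs $i<j$ agrees with the weighted combination of $U_{1c}$, $U_{2c}$ and $U_{3c}$, and that the three weights sum to one.

```python
from math import comb


def example_scores(n):
    """Made-up symmetric stand-ins for the products a_ij * b_ij."""
    return [[((min(i, j) * 7 + max(i, j) * 3) % 5) - 2 for j in range(n)]
            for i in range(n)]


def cd_direct(c, n):
    """Average of the scores over all pairs i < j."""
    return sum(c[i][j] for i in range(n) for j in range(i + 1, n)) / comb(n, 2)


def cd_from_u_statistics(c, n1, nc):
    """Same quantity, built from U_1c, U_2c (one-sample) and U_3c (two-sample).

    Positions 0..n1-1 hold the type 1 pairs, positions n1..n1+nc-1 the
    type 2 or 3 pairs; n1 and nc must both be at least 2.
    """
    n = n1 + nc
    u1c = sum(c[i][j] for i in range(n1) for j in range(i + 1, n1)) / comb(n1, 2)
    u2c = sum(c[i][j] for i in range(n1, n) for j in range(i + 1, n)) / comb(nc, 2)
    u3c = sum(c[i][j] for i in range(n1) for j in range(n1, n)) / (n1 * nc)
    return (comb(n1, 2) * u1c + comb(nc, 2) * u2c + n1 * nc * u3c) / comb(n, 2)


def weights(n1, nc):
    """Weights on (U_1c, U_2c, U_3c); with n1/n -> lambda_1 and nc/n -> lambda_2
    they tend to lambda_1^2, lambda_2^2 and 2*lambda_1*lambda_2."""
    total = comb(n1 + nc, 2)
    return comb(n1, 2) / total, comb(nc, 2) / total, n1 * nc / total
```

For example, `weights(6000, 4000)` is already close to $(0.36, 0.16, 0.48)$, the values $\lambda_1^2$, $\lambda_2^2$ and $2\lambda_1\lambda_2$ for $\lambda_1 = 0.6$ and $\lambda_2 = 0.4$.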


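Using the fact that

$$TE_{n_1,n_c} = K_{1n} TE^*_{n_1} + K_{2n} TE^*_{n_c},$$

where

$$TE^*_{n_1} = (3n_1)^{1/2}(U_{1,n_1} - 1/2) + o_p(1)$$

and

$$TE^*_{n_c} = (n_c)^{1/2}(U_{3,n_c}),$$

and from Theorem 3.6.9 of Randles and Wolfe (1979, pg. 107), it follows that

$$\begin{pmatrix} n^{1/2}(U_{1,n_1} - 1/2) \\ n^{1/2}(U_{3,n_c}) \\ n^{1/2}(U_{1c}) \\ n^{1/2}(U_{2c}) \\ n^{1/2}(U_{3c}) \end{pmatrix} \xrightarrow{d} N(0, \Sigma_U),$$

where $\Sigma_U = (\sigma^{(a,b)})$ and

$$\sigma^{(a,b)} = \sum_{i=1}^{2} \frac{r_i^{(a)} r_i^{(b)}}{\lambda_i}\, \zeta_i^{(a,b)}.$$

Here $U_{1,n_1}$ and $U_{1c}$ are U-statistics of degree (2,0), $U_{3,n_c}$ is of degree (0,1), $U_{2c}$ is of degree (0,2) and $U_{3c}$ is of degree (1,1). From the proof of Theorem 2.4.1, we have $\sigma^{(1,1)} = 1/(3\lambda_1)$, $\sigma^{(2,2)} = 1/\lambda_2$ and $\sigma^{(1,2)} = 0$. In addition,

$$\sigma^{(3,3)} = (4/\lambda_1)\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_j=\delta_k=1\},$$

$$\sigma^{(4,4)} = (4/\lambda_2)\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid (\delta_i,\delta_j,\delta_k) \in (2,3)\},$$

$$\sigma^{(5,5)} = (1/\lambda_1)\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=1,\ (\delta_j,\delta_k) \in (2,3)\} + (1/\lambda_2)\,Cov\{a_{ik}b_{ik},\ a_{jk}b_{jk} \mid \delta_i=\delta_j=1,\ \delta_k \in (2,3)\},$$

$$\sigma^{(3,4)} = \sigma^{(1,4)} = \sigma^{(2,3)} = 0,$$

$$\sigma^{(3,5)} = (2/\lambda_1)\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1,\ \delta_k \in (2,3)\},$$

$$\sigma^{(4,5)} = (2/\lambda_2)\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid (\delta_i,\delta_j) \in (2,3),\ \delta_k=1\},$$

$$\sigma^{(1,5)} = (2/\lambda_1)\,Cov\{\Psi(X_{1i}-X_{2i}+X_{1j}-X_{2j}),\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1,\ \delta_k \in (2,3)\},$$

$$\sigma^{(1,3)} = (4/\lambda_1)\,Cov\{\Psi(X_{1i}-X_{2i}+X_{1j}-X_{2j}),\ a_{ik}b_{ik} \mid \delta_i=\delta_j=\delta_k=1\},$$

$$\sigma^{(2,5)} = (1/\lambda_2)\,Cov\{1-2Y_j,\ a_{ij}b_{ij} \mid \delta_i=1,\ \delta_j \in (2,3)\},$$

$$\sigma^{(2,4)} = (2/\lambda_2)\,Cov\{1-2Y_j,\ a_{ij}b_{ij} \mid (\delta_i,\delta_j) \in (2,3)\}.$$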


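Next, we get the distribution of

$$n^{1/2}(U_{1,n_1} - 1/2),\quad n^{1/2}(U_{3,n_c}),\quad n^{1/2}(CD).$$

Note that

$$n^{1/2}(CD) = n^{1/2} \binom{n}{2}^{-1} \Bigl\{\binom{n_1}{2} U_{1c} + \binom{n_c}{2} U_{2c} + n_1 n_c U_{3c}\Bigr\},$$

$$\lim_{n\to\infty} \binom{n_1}{2}\binom{n}{2}^{-1} = \lim_{n\to\infty} \frac{n_1(n_1-1)}{n(n-1)} = \lambda_1^2, \qquad \lim_{n\to\infty} \binom{n_c}{2}\binom{n}{2}^{-1} = \lambda_2^2$$

and

$$\lim_{n\to\infty} n_1 n_c \binom{n}{2}^{-1} = \lim_{n\to\infty} \frac{2 n_1 n_c}{n(n-1)} = 2\lambda_1\lambda_2.$$

Thus, we need

$$A \begin{pmatrix} n^{1/2}(U_{1,n_1} - 1/2) \\ n^{1/2}(U_{3,n_c}) \\ n^{1/2}(U_{1c}) \\ n^{1/2}(U_{2c}) \\ n^{1/2}(U_{3c}) \end{pmatrix} = \begin{pmatrix} n^{1/2}(U_{1,n_1} - 1/2) \\ n^{1/2}(U_{3,n_c}) \\ n^{1/2}(\lambda_1^2 U_{1c} + \lambda_2^2 U_{2c} + 2\lambda_1\lambda_2 U_{3c}) \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & \lambda_1^2 & \lambda_2^2 & 2\lambda_1\lambda_2 \end{pmatrix},$$

which is asymptotically equivalent to

$$\begin{pmatrix} n^{1/2}(U_{1,n_1} - 1/2) \\ n^{1/2}(U_{3,n_c}) \\ n^{1/2}(CD) \end{pmatrix},$$

with variance-covariance matrix $A \Sigma_U A' = \Sigma_{CD} = (\sigma_{CD}^{(a,b)})$, where

$$\sigma_{CD}^{(1,1)} = \sigma^{(1,1)},$$

$$\sigma_{CD}^{(2,2)} = \sigma^{(2,2)},$$

$$\sigma_{CD}^{(1,3)} = \lambda_1^2 \sigma^{(1,3)} + 2\lambda_1\lambda_2 \sigma^{(1,5)},$$

$$\sigma_{CD}^{(2,3)} = \lambda_2^2 \sigma^{(2,4)} + 2\lambda_1\lambda_2 \sigma^{(2,5)}$$

and

$$\sigma_{CD}^{(3,3)} = \lambda_1^4 \sigma^{(3,3)} + 2\lambda_1^2(2\lambda_1\lambda_2)\sigma^{(3,5)} + 2\lambda_2^2(2\lambda_1\lambda_2)\sigma^{(4,5)} + \lambda_2^4 \sigma^{(4,4)} + (2\lambda_1\lambda_2)^2 \sigma^{(5,5)}.$$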


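Next, it will be argued that $\sigma_{CD}^{(3,3)}$ is asymptotically the same as the variance for CD (i.e., $4\gamma$) derived in Section 2.4. Note that

$$2\lambda_1^2(2\lambda_1\lambda_2)\sigma^{(3,5)} + 2\lambda_2^2(2\lambda_1\lambda_2)\sigma^{(4,5)} + 4\lambda_1^2\lambda_2^2\sigma^{(5,5)}$$

$$= 8\lambda_1^2\lambda_2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1,\ \delta_k \in (2,3)\}$$
$$+ 8\lambda_2^2\lambda_1\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid (\delta_i,\delta_j) \in (2,3),\ \delta_k=1\}$$
$$+ 4\lambda_1\lambda_2^2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=1,\ (\delta_j,\delta_k) \in (2,3)\}$$
$$+ 4\lambda_1^2\lambda_2\,Cov\{a_{ik}b_{ik},\ a_{jk}b_{jk} \mid \delta_i=\delta_j=1,\ \delta_k \in (2,3)\}$$

$$= 4\lambda_1^2\lambda_2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1,\ \delta_k \in (2,3)\}$$
$$+ 4\lambda_1^2\lambda_2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_k=1,\ \delta_j \in (2,3)\}$$
$$+ 4\lambda_1\lambda_2^2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid (\delta_i,\delta_j) \in (2,3),\ \delta_k=1\}$$
$$+ 4\lambda_1\lambda_2^2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid (\delta_i,\delta_k) \in (2,3),\ \delta_j=1\}$$
$$+ 4\lambda_1\lambda_2^2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=1,\ (\delta_j,\delta_k) \in (2,3)\}$$
$$+ 4\lambda_1^2\lambda_2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i \in (2,3),\ \delta_j=\delta_k=1\}.$$

Thus, combining all the terms in $\sigma_{CD}^{(3,3)}$, we have

$$\sigma_{CD}^{(3,3)} = 4\lambda_1^3\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_j=\delta_k=1\}$$
$$+ 4\lambda_2^3\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid (\delta_i,\delta_j,\delta_k) \in (2,3)\}$$
$$+ 4\lambda_1^2\lambda_2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1,\ \delta_k \in (2,3)\}$$
$$+ 4\lambda_1^2\lambda_2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_k=1,\ \delta_j \in (2,3)\}$$
$$+ 4\lambda_1\lambda_2^2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid (\delta_i,\delta_j) \in (2,3),\ \delta_k=1\}$$
$$+ 4\lambda_1\lambda_2^2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid (\delta_i,\delta_k) \in (2,3),\ \delta_j=1\}$$
$$+ 4\lambda_1\lambda_2^2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=1,\ (\delta_j,\delta_k) \in (2,3)\}$$
$$+ 4\lambda_1^2\lambda_2\,Cov\{a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i \in (2,3),\ \delta_j=\delta_k=1\}. \qquad (4.3.1)$$

Recalling that

$$\frac{n_1}{n} = (\text{proportion of sample which are type 1's}) \xrightarrow{p} \lambda_1 = (\text{probability of being a type 1}),$$

the $\lambda$ coefficients in front of each covariance term are the probabilities necessary to uncondition each covariance term. For example,

$$\lambda_1^3\,Cov(a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_j=\delta_k=1) = P(\delta_i=\delta_j=\delta_k=1)\,Cov(a_{ij}b_{ij},\ a_{ik}b_{ik} \mid \delta_i=\delta_j=\delta_k=1) = Cov(a_{ij}b_{ij},\ a_{ik}b_{ik},\ \delta_i=\delta_j=\delta_k=1).$$

Now note that the eight covariance terms correspond to the eight possibilities for the subscripts i, j and k (i.e., 3 subscripts with 2 possibilities for each, that is, each subscript is either a 1 or (2,3), yields $2^3 = 8$ combinations) and thus

$$\sigma_{CD}^{(3,3)} = 4\,Cov(a_{ij}b_{ij},\ a_{ik}b_{ik}) = 4\gamma.$$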


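The last step of the proof (i.e., showing $W_{3n} \xrightarrow{d} N(0,\Sigma_3)$) follows from observing that

$$\begin{pmatrix} K_1(3\lambda_1)^{1/2} & K_2(\lambda_2)^{1/2} & 0 \\ 0 & 0 & (4\gamma)^{-1/2} \end{pmatrix} \begin{pmatrix} n^{1/2}(U_{1,n_1} - 1/2) \\ n^{1/2}(U_{3,n_c}) \\ n^{1/2}(CD) \end{pmatrix} = \begin{pmatrix} K_1(3)^{1/2}(n\lambda_1)^{1/2}(U_{1,n_1} - 1/2) + K_2(n\lambda_2)^{1/2}(U_{3,n_c}) \\ CD^* \end{pmatrix} \xrightarrow{d} N(0, \Sigma_3)$$

by Theorem A in Serfling (1980, pg. 122), where $\Sigma_3 = (\sigma_3^{(a,b)})$,

$$\sigma_3^{(1,1)} = K_1^2 + K_2^2,$$

$$\sigma_3^{(2,2)} = 1$$

and

$$\sigma_3^{(1,2)} = K_1(3\lambda_1)^{1/2}\{\lambda_1^2\sigma^{(1,3)} + 2\lambda_1\lambda_2\sigma^{(1,5)}\}(4\gamma)^{-1/2} + K_2(\lambda_2)^{1/2}\{\lambda_2^2\sigma^{(2,4)} + 2\lambda_1\lambda_2\sigma^{(2,5)}\}(4\gamma)^{-1/2}.$$

After some simplification, similar to that used to show that $\sigma_{CD}^{(3,3)}$ was equivalent to $4\gamma$, it follows that $\sigma_3^{(1,2)}$ is equal to

$$K_1(4\gamma)^{-1/2}(3\lambda_1)^{1/2}\{4\,Cov[\Psi(X_{1i}-X_{2i}+X_{1j}-X_{2j}),\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1]\} + K_2(4\gamma)^{-1/2}(\lambda_2)^{1/2}\{2\,Cov(1-2Y_j,\ a_{ij}b_{ij} \mid \delta_j \in (2,3))\}.$$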
Recalling that

$$\frac{n_1}{N} \xrightarrow{p} \lambda_1 \quad \text{and} \quad \frac{n_c}{N} \xrightarrow{p} \lambda_2,$$

and using a similar argument as in Theorem 4.2.5 (pgs. 109-110), we get

$$\begin{pmatrix} K_{1n} TE^*_{n_1} + K_{2n} TE^*_{n_c} \\ CD^* \end{pmatrix} \xrightarrow{d} N(0, \Sigma_3)$$

and the proof of Theorem 4.3.1 is complete. □

Note that, similar to the case with $W_{1n}$ and $W_{2n}$, $W_{3n}' \Sigma_3^{-1} W_{3n}$ is not distribution-free, although it is asymptotically distribution-free.

The following corollary states results which follow directly from Theorem 4.3.1.

Corollary 4.3.2: Under $H_0$,

(1) $W_{3n}' \Sigma_3^{-1} W_{3n} \xrightarrow{d} \chi^2_{(2)}$,

and

(2) if $\hat{\Sigma}_3$ is any consistent estimator of $\Sigma_3$, then

$$W_{3n}' \hat{\Sigma}_3^{-1} W_{3n} \xrightarrow{d} \chi^2_{(2)}.$$

Proof:

The proof is omitted, since (1) and (2) follow directly from Theorem 4.3.1 and well known results. □
This section has established the asymptotic distribution for the quadratic form $W_{3n}' \Sigma_3^{-1} W_{3n}$, which could be used for a large sample test for the general alternative of location and/or scale differences. The next section will discuss a permutation test which could be performed for any of the quadratic forms (based on $W_{1n}$, $W_{2n}$ or $W_{3n}$) mentioned in Sections 4.2 and 4.3. Section 4.5 will discuss consistent estimators for $\Sigma_1$, $\Sigma_2$ and $\Sigma_3$.


4.4  Permutation Test

In the situation where the sample size is small, a good estimate for $\Sigma_i$, $i=1,2,3$, may not exist, or the limiting $\chi^2_{(2)}$ distribution may not provide an adequate approximation for the distribution of $W_{in}' \Sigma_i^{-1} W_{in}$, $i=1,2$ or 3. In this case, a small sample permutation test is recommended.

Recall, in Section 2.3, a permutation test for CD was discussed. It was based on the $2^n$ possible samples

$$\{[(X_{11},X_{21},\delta_1)^{k_1}, (X_{12},X_{22},\delta_2)^{k_2}, \ldots, (X_{1n},X_{2n},\delta_n)^{k_n}] : k_i = 0 \text{ or } 1 \text{ for } i=1,2,\ldots,n\},$$

which are equally likely under $H_0$. Here

$$(X_{1i},X_{2i},\delta_i)^{k_i} = \begin{cases} (X_{1i},X_{2i},\delta_i) & \text{if } k_i = 0 \\ (X_{2i},X_{1i},f(\delta_i)) & \text{if } k_i = 1. \end{cases}$$

The permutation test in this section is based on the same $2^n$ samples which are equally likely under $H_0$. (A slight change is present though, since $n = N_1 + N_c$ here, that is, no type 4 pairs are included in the sample for the calculations.)

Without loss of generality, it will be assumed that $W_{1n}' \Sigma_1^{-1} W_{1n}$ is the test statistic for which the permutation test is being done. The permutation tests for the statistics based on $W_{2n}$ and $W_{3n}$ are performed similarly. It is also assumed that a particular $K_{1n}$ and $K_{2n}$ ($L_{1n}$ and $L_{2n}$) have been chosen by the researcher.

Let $w_{1n}(1)$ denote the first component in $W_{1n}$; that is, the location statistic, which is

$$w_{1n}(1) = K_{1n} TE^*_{n_1} + K_{2n} TE^*_{n_c},$$

and let $w_{1n}(2)$ denote the second component in $W_{1n}$; that is, the scale statistic used for $W_{1n}$. For each of the $2^n$ equally likely samples, the statistics $w_{1n}(1)$, $w_{1n}(2)$ and $w_{1n}(1)w_{1n}(2)$ are computed and their values tallied. From these tallies, the relative frequency of each possible value of $w_{1n}(1)$, $w_{1n}(2)$ and $w_{1n}(1)w_{1n}(2)$ is determined, and these relative frequencies are then the probabilities that $w_{1n}(1)$, $w_{1n}(2)$ and $w_{1n}(1)w_{1n}(2)$ assume the corresponding distinct values. Using these probabilities, the actual conditional variances of $w_{1n}(1)$ and $w_{1n}(2)$ and the actual conditional covariance of $w_{1n}(1)$ and $w_{1n}(2)$ can be calculated.

Let $\Sigma_c$ denote the conditional variance-covariance matrix for $W_{1n}$. Now we calculate the $2^n$ not necessarily distinct values for $W_{1n}' \Sigma_c^{-1} W_{1n}$ determined by the $2^n$ equally likely samples under $H_0$. From these calculations compute the relative frequency for each distinct value of $W_{1n}' \Sigma_c^{-1} W_{1n}$, thus obtaining the conditional probability distribution of $W_{1n}' \Sigma_c^{-1} W_{1n}$. The null hypothesis is rejected if $W_{1n}' \Sigma_c^{-1} W_{1n}$ for the actual observed sample is too large according to this conditional distribution.
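The enumeration just described can be sketched in a few lines. The code below is a minimal illustration, not the program used for this dissertation. Two things in it are assumptions: the relabelling $f(\delta)$ is taken to exchange types 2 and 3 and leave type 1 unchanged, and `toy_statistics` is a hypothetical stand-in (a sign statistic for location and a sign statistic on absolute deviations from a fixed centre for scale) for the actual components of $W_{1n}$. Any function returning a 2-vector whose conditional mean is zero could be passed in its place.

```python
from itertools import product


def swap_type(delta):
    """Assumed form of f(delta): types 2 and 3 trade places, type 1 is unchanged."""
    return {1: 1, 2: 3, 3: 2}[delta]


def sign(v):
    return (v > 0) - (v < 0)


def toy_statistics(sample, center=1.5):
    """Hypothetical (location, scale) pair; both change sign when a pair is swapped."""
    location = sum(sign(x1 - x2) for x1, x2, _ in sample)
    scale = sum(sign(abs(x2 - center) - abs(x1 - center)) for x1, x2, _ in sample)
    return (location, scale)


def permutation_test(sample, statistics):
    """Enumerate the 2^n equally likely samples and return
    (conditional covariance matrix, observed quadratic form, p-value)."""
    values = []
    for ks in product((0, 1), repeat=len(sample)):
        permuted = [(x2, x1, swap_type(d)) if k else (x1, x2, d)
                    for k, (x1, x2, d) in zip(ks, sample)]
        values.append(statistics(permuted))
    m = len(values)
    mean1 = sum(v[0] for v in values) / m
    mean2 = sum(v[1] for v in values) / m
    var1 = sum((v[0] - mean1) ** 2 for v in values) / m
    var2 = sum((v[1] - mean2) ** 2 for v in values) / m
    cov12 = sum((v[0] - mean1) * (v[1] - mean2) for v in values) / m
    det = var1 * var2 - cov12 ** 2  # must be nonzero for the inverse to exist

    def quad(w):
        # w' Sigma_c^{-1} w using the closed-form inverse of a 2x2 matrix
        return (var2 * w[0] ** 2 - 2.0 * cov12 * w[0] * w[1] + var1 * w[1] ** 2) / det

    observed = quad(statistics(sample))
    p_value = sum(1 for v in values if quad(v) >= observed - 1e-12) / m
    return [[var1, cov12], [cov12, var2]], observed, p_value
```

The null hypothesis would be rejected when the returned p-value is small. Because all $2^n$ samples are enumerated, this approach is only practical for small $n$, which is the situation for which the permutation test is recommended.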


4.5  Estimating the Covariance

In Sections 4.2 and 4.3, the asymptotic distribution of $W_{in}' \Sigma_i^{-1} W_{in}$ for $i=1,2,3$ was established. In each case, $\Sigma_i$ depended on the underlying distributions $F(\cdot,\cdot)$ and $G(\cdot)$ (the c.d.f. of $(X_{1i}, X_{2i})$ and $C_i$, respectively). Hence, we cannot perform a large sample test based on $W_{in}' \Sigma_i^{-1} W_{in}$ unless $\Sigma_i$, $i=1,2,3$, is known.

This section will discuss estimation for the components of $\Sigma_i$, $i=1,2,3$, which depend on the underlying distribution. These estimators, when substituted into the appropriate quantities they are estimating, provide asymptotically distribution-free statistics that can be used in the hypothesis testing situations considered in this dissertation.

For the variance-covariance matrix $\Sigma_1$ and, thus, $\Sigma_2$ since it is identical, the term in $\Sigma_1$ which depends on the underlying distribution is $\sigma_1^{(1,2)} = 12 K_1 L_1 P^* - K_2 L_2 (3/4)^{1/2}$ (page 109). The dependence is due to $P^*$, which was defined as

$$P^* = P\{(|X_{21}-\mu| - |X_{11}-\mu| + |X_{22}-\mu| - |X_{12}-\mu|) > 0,\ (X_{11} - X_{21} + X_{13} - X_{23}) > 0\} - 1/4,$$

and was a result of the asymptotic covariance between $TE^*_{n_1}$ and $T^*_{n_1}$. Lemma 4.5.1 defines a consistent estimator for the quantity $12P^*$ (the asymptotic covariance of $TE^*_{n_1}$ and $T^*_{n_1}$). First, though, we describe some notation which will be needed. Define

$$X_i = (X_{1i}, X_{2i}),$$

$$h^{(1)}(X_i) = \Psi(|X_{2i}-M| - |X_{1i}-M|),$$

$$h^{(2)}(X_i,X_j) = \Psi(|X_{2i}-M| - |X_{1i}-M| + |X_{2j}-M| - |X_{1j}-M|),$$

$$h^{(3)}(X_i) = \Psi(X_{1i} - X_{2i}),$$

$$h^{(4)}(X_i,X_j) = \Psi(X_{1i} - X_{2i} + X_{1j} - X_{2j}),$$

$$h^{(1,3)}(X_i) = h^{(1)}(X_i)h^{(3)}(X_i),$$

$$h^{(1,4)}(X_i,X_j) = h^{(1)}(X_i)h^{(4)}(X_i,X_j) + h^{(1)}(X_j)h^{(4)}(X_i,X_j),$$

$$h^{(2,3)}(X_i,X_j) = h^{(2)}(X_i,X_j)h^{(3)}(X_i) + h^{(2)}(X_i,X_j)h^{(3)}(X_j)$$

and

$$h_2^{(2,4)}(X_i,X_j) = h^{(2)}(X_i,X_j)h^{(4)}(X_i,X_j),$$

$$h_1^{(2,4)}(X_i,X_j,X_k) = h^{(2)}(X_i,X_j)h^{(4)}(X_i,X_k) + h^{(2)}(X_i,X_k)h^{(4)}(X_i,X_j) + h^{(2)}(X_i,X_j)h^{(4)}(X_j,X_k)$$
$$\qquad + h^{(2)}(X_j,X_k)h^{(4)}(X_i,X_j) + h^{(2)}(X_i,X_k)h^{(4)}(X_j,X_k) + h^{(2)}(X_j,X_k)h^{(4)}(X_i,X_k).$$
The quantities $h^{(1)}(\cdot)$, $h^{(2)}(\cdot,\cdot)$, $h^{(3)}(\cdot)$ and $h^{(4)}(\cdot,\cdot)$ are actually the kernels of U-statistics, or kernels of U-statistics with an estimated parameter, which are used in the representation of $T_{n_1}$ and $TE_{n_1}$. (See page 74 for the exact U-statistic representation of $T_{n_1}$. A similar representation for $TE_{n_1}$ can be defined.) The quantities $h^{(1,4)}(\cdot,\cdot)$, $h^{(2,3)}(\cdot,\cdot)$, etc., are needed to calculate the covariance between the kernels of the U-statistic representations of $T_{n_1}$ and $TE_{n_1}$. A consistent estimator for $Cov(T^*_{n_1}, TE^*_{n_1})$, which will be defined in Lemma 4.5.1, can be viewed as estimating the exact covariance between $TE_{n_1}$ and $T_{n_1}$ using the sample covariance. In the proof of Lemma 4.5.1, it will be argued that this estimator is a consistent estimator for the asymptotic covariance.

Lemma 4.5.1: Under $H_0$,


Co^TE^.T^)  = 


-r     4     "(1,3)      4     ;(1,4)      4     ;(2,3) 
'   '  ""   2  C,      +  (ii,-!)  c,      +  (n,-l)  C 


(a1-l)£   1 


1   '    ! 


1   '    ! 


a.  4U1  2)  C(2,4)     __2 "(2,4) 


is a consistent estimator for $12P^*$ (the asymptotic covariance of $TE^*_{n_1}$ and $T^*_{n_1}$), where


-(1,3).  JL  I      hU,3)      _  h(l)h(3) 
1        ni  i=i 


JCi.*>._L.  j  j  h(1'4)(x.  ,x.)  -  h(1)h(4) 


(2,3)_  1 


£-  I  I  h(2>3)(x   ,X  )  - 
1)  i<j  J 


h(2)h(3) 


'(2,4).    1 


i III  h(2'4)(X.,X.,Xl)  -  h(2)h(4 

nl]  i<i<k   l      ^  ^    ^k 


6["lj  i<j 
3 


£<*.*>  =    _L   M  h(2>4)(X.,X.)  -  h(2)h(4) 
2 


-i-   I  I  h(2'4)(X.,X.)  -  h' 
(nil  i<j   2       ^  ^ 


$\bar{h}^{(1)} = \frac{1}{n_1} \sum_{i=1}^{n_1} h^{(1)}(X_i)$ (i.e., the actual sample value of a U-statistic with kernel $h^{(1)}$) and analogous definitions for $\bar{h}^{(2)}$, $\bar{h}^{(3)}$ and $\bar{h}^{(4)}$.


Proof : 


Note  that 


r       *    *        A(2  4  )  ■.    P 
Cov(TE   ,T   )  -  12c   *   1  —+    0. 

L       V  nl        1 


Thus,  the  proof  will  be  complete  if  it  can  be  argued  that 

A(  2  4  )  * 

X,       '    is  a  consistent  estimator  of  P  .   This  follows 

1 
directly  since  for  a  U-statistic,  U  ,  based  on  a  kernel,  h  , 

*    P    * 
U    — *  h   by  Hoeffding's  Theorem  (Hoeffding,  1961).       Q 
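To make the target quantity concrete, the sketch below computes a simple plug-in estimate of $P^*$ directly from its definition in this section, by averaging the product of the two indicators over all ordered triples of distinct uncensored pairs. This is not the estimator of Lemma 4.5.1; it is a simpler alternative shown only for illustration, and the function names and the argument `m` (standing for the location estimate $M$) are hypothetical.

```python
from itertools import permutations


def psi(t):
    """Indicator kernel: 1 if t > 0, otherwise 0."""
    return 1 if t > 0 else 0


def estimate_p_star(pairs, m):
    """Plug-in estimate of P* from uncensored pairs (x1, x2) and location m.

    For each ordered triple (i, j, k) of distinct pairs, the scale indicator
    uses pairs i and j and the location indicator uses pairs i and k, which
    mirrors the index pattern (1, 2) and (1, 3) in the definition of P*.
    """
    n = len(pairs)
    if n < 3:
        raise ValueError("at least three uncensored pairs are required")
    total = 0
    count = 0
    for i, j, k in permutations(range(n), 3):
        x1i, x2i = pairs[i]
        x1j, x2j = pairs[j]
        x1k, x2k = pairs[k]
        scale = abs(x2i - m) - abs(x1i - m) + abs(x2j - m) - abs(x1j - m)
        location = x1i - x2i + x1k - x2k
        total += psi(scale) * psi(location)
        count += 1
    return total / count - 0.25
```

Multiplying the returned value by 12 gives a corresponding estimate of $12P^*$. The triple loop costs on the order of $n_1^3$ operations, which is acceptable for the moderate sample sizes considered here.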


Note that this is just one of many possible consistent estimators for $12P^*$. This estimator is presented because it worked well in the Monte Carlo study presented in Chapter Five. Although other estimators may appear to be reasonable, all too often in practice the determinant of the resulting estimated variance-covariance matrix will be less than or equal to zero. Now we consider estimators for the variance-covariance matrix for $W_{3n}$, the vector whose quadratic form uses CD.

In looking at the variance-covariance matrix $\Sigma_3$ derived in Section 4.3, we notice immediately that the estimation needed here is more complicated than that of $\Sigma_1$. First, a consistent estimate for the asymptotic variance of CD (i.e., $4\gamma$) is needed. Secondly, we need to estimate two asymptotic covariances which are used in the calculation of $\sigma_3^{(1,2)}$ (i.e., the asymptotic covariance between $TE^*_{n_1,n_c}$ and $CD^*$).

1  '  c 

The first estimation problem has already been taken care of in Section 2.4. The method of solving the second estimation problem is similar to that used for estimating $12P^*$. That is, the exact covariance between $TE^*_{n_1,n_c}$ and $CD^*$ is derived and the sample quantities are then used in its calculation. Lemma 4.5.2 will present this estimator and argue that this is a consistent estimator for the asymptotic covariance. This estimator will actually be in the form of two estimators: one which is estimating the covariance of $TE_{n_1}$ and CD, and the other which is estimating the covariance of $TE_{n_c}$ and CD. This is equivalent to estimating the quantities

$$Cov[\Psi(X_{1i} - X_{2i} + X_{1j} - X_{2j}),\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1]$$

and

$$Cov[(1 - 2Y_j),\ a_{ij}b_{ij} \mid \delta_j \in (2,3)]$$

in $\sigma_3^{(1,2)}$.


First, though, we describe some notation which will be needed. Let $h^{(3)}(X_i)$ and $h^{(4)}(X_i,X_j)$ be defined as before. In addition, define

$$h^{(1c)}(X_i,X_j) = h^{(2c)}(X_i,X_j) = h^{(3c)}(X_i,X_j) = a_{ij}b_{ij},$$

$$h^{(5)}(X_j) = 1 - 2Y_j,$$

$$h^{(3,1c)}(X_i,X_j) = h^{(3)}(X_i)h^{(1c)}(X_i,X_j) + h^{(3)}(X_j)h^{(1c)}(X_i,X_j),$$

$$h_1^{(4,1c)}(X_i,X_j,X_k) = h^{(4)}(X_i,X_j)h^{(1c)}(X_i,X_k) + h^{(4)}(X_i,X_j)h^{(1c)}(X_j,X_k) + h^{(4)}(X_i,X_k)h^{(1c)}(X_i,X_j)$$
$$\qquad + h^{(4)}(X_i,X_k)h^{(1c)}(X_j,X_k) + h^{(4)}(X_j,X_k)h^{(1c)}(X_i,X_k) + h^{(4)}(X_j,X_k)h^{(1c)}(X_i,X_j)$$

and

$$h_2^{(4,1c)}(X_i,X_j) = h^{(4)}(X_i,X_j)h^{(1c)}(X_i,X_j),$$

$$h^{(3,3c)}(X_i,X_j) = h^{(3)}(X_i)h^{(3c)}(X_i,X_j),$$

$$h^{(4,3c)}(X_i,X_j,X_k) = h^{(4)}(X_i,X_j)h^{(3c)}(X_i,X_k) + h^{(4)}(X_i,X_j)h^{(3c)}(X_j,X_k),$$

$$h^{(5,2c)}(X_i,X_j) = h^{(5)}(X_i)h^{(2c)}(X_i,X_j) + h^{(5)}(X_j)h^{(2c)}(X_i,X_j),$$

$$h^{(5,3c)}(X_i,X_j) = h^{(5)}(X_j)h^{(3c)}(X_i,X_j).$$

Note, the quantities $h^{(1c)}(\cdot,\cdot)$, $h^{(2c)}(\cdot,\cdot)$ and $h^{(3c)}(\cdot,\cdot)$ are actually the kernels in the U-statistic representation of CD given on page 113. The quantities $h^{(3,1c)}(\cdot,\cdot)$, $h_1^{(4,1c)}(\cdot,\cdot,\cdot)$, etc., are needed to calculate the covariance between the kernels of the U-statistic representations of $TE_{n_1}$ and CD, and $TE_{n_c}$ and CD. In Lemma 4.5.2, the consistent estimator is now defined.

Lemma 4.5.2: Under $H_0$, a consistent estimator for

$$K_1(4\gamma)^{-1/2}(3\lambda_1)^{1/2}\,[4\,Cov\{\Psi(X_{1i}-X_{2i}+X_{1j}-X_{2j}),\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1\}] + K_2(4\gamma)^{-1/2}(\lambda_2)^{1/2}\,[2\,Cov\{1-2Y_j,\ a_{ij}b_{ij} \mid \delta_j \in (2,3)\}]$$

is the following:


K  Uy)"^  (3AX  )V2  r   4    a(3'lG)  +  4U1"2)   J(4,lc) 


4n       a  ,  0  „  v     4n 


,    2    M4,lc)  ,        c      *(3,3c)  ,     c   *  ( 4  ,  3c) 

(n-1)  ^2       +  (n-lXnj-1)  S  (n-1)  ^ 

+  K  (4v)"1/2  (X  )V2  r2Uc"  -  r(5'2c)  +  2ni    ;<5'3c>l 

k.2^y;    u2;   [   (n_1)   ^      +  (n_1}   c;^     j 


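where

$4\hat{\gamma}$ is a consistent estimate for $4\gamma$,

$$\hat{\lambda}_1 = \frac{n_1}{n_1+n_c}, \qquad \hat{\lambda}_2 = \frac{n_c}{n_1+n_c},$$

and $\hat{\zeta}^{(3,1c)}$, $\hat{\zeta}_1^{(4,1c)}$, $\hat{\zeta}_2^{(4,1c)}$, etc. are U-statistics which are summarized in the following table.

Table 4.1  Summarizing the U-statistics Used in Estimating the Covariance for $\Sigma_3$

U-Statistic                  Kernel                                   Conditions on the $\delta$'s

$\hat{\zeta}^{(3,1c)}$       $h^{(3,1c)}(\cdot,\cdot)$                $\delta_i=\delta_j=1$
$\hat{\zeta}_1^{(4,1c)}$     $h_1^{(4,1c)}(\cdot,\cdot,\cdot)$        $\delta_i=\delta_j=\delta_k=1$
$\hat{\zeta}_2^{(4,1c)}$     $h_2^{(4,1c)}(\cdot,\cdot)$              $\delta_i=\delta_j=1$
$\hat{\zeta}^{(3,3c)}$       $h^{(3,3c)}(\cdot,\cdot)$                $\delta_i=1,\ \delta_j \in (2,3)$
$\hat{\zeta}^{(4,3c)}$       $h^{(4,3c)}(\cdot,\cdot,\cdot)$          $\delta_i=\delta_j=1,\ \delta_k \in (2,3)$
$\hat{\zeta}^{(5,2c)}$       $h^{(5,2c)}(\cdot,\cdot)$                $(\delta_i,\delta_j) \in (2,3)$
$\hat{\zeta}^{(5,3c)}$       $h^{(5,3c)}(\cdot,\cdot)$                $\delta_i=1,\ \delta_j \in (2,3)$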

Proof:

Let the estimator defined in Lemma 4.5.2 be denoted by $\widehat{Cov}(TE^*_{n_1,n_c}, CD^*)$. Notice that

$$\widehat{Cov}(TE^*_{n_1,n_c}, CD^*) - K_1(4\hat{\gamma})^{-1/2}(3\hat{\lambda}_1)^{1/2}\Bigl[\frac{4(n_1-2)}{(n-1)}\hat{\zeta}_1^{(4,1c)} + \frac{4n_c}{(n-1)}\hat{\zeta}^{(4,3c)}\Bigr] - K_2(4\hat{\gamma})^{-1/2}(\hat{\lambda}_2)^{1/2}\Bigl[\frac{2(n_c-1)}{(n-1)}\hat{\zeta}^{(5,2c)} + \frac{2n_1}{(n-1)}\hat{\zeta}^{(5,3c)}\Bigr] \xrightarrow{p} 0.$$

Thus, if we can show that

$$\frac{4(n_1-2)}{(n-1)}\hat{\zeta}_1^{(4,1c)} + \frac{4n_c}{(n-1)}\hat{\zeta}^{(4,3c)}$$

is a consistent estimator for

$$4\,Cov\{\Psi(X_{1i} - X_{2i} + X_{1j} - X_{2j}),\ a_{ik}b_{ik} \mid \delta_i=\delta_j=1\},$$

and that

$$\frac{2(n_c-1)}{(n-1)}\hat{\zeta}^{(5,2c)} + \frac{2n_1}{(n-1)}\hat{\zeta}^{(5,3c)}$$

is a consistent estimator for

$$2\,Cov\{1-2Y_j,\ a_{ij}b_{ij} \mid \delta_j \in (2,3)\},$$

the proof will be complete. Recalling that

$$\frac{n_1}{n} \xrightarrow{p} \lambda_1 \quad \text{and} \quad \frac{n_c}{n} \xrightarrow{p} \lambda_2,$$

we observe that asymptotically, the coefficient in front of each $\hat{\zeta}$ term is the probability necessary to uncondition the term so that within each estimator the terms can be combined appropriately. (See page 118 for a similar argument.) Also, since each estimator is a U-statistic, it follows that each is consistently estimating the appropriate covariance term and the proof of Lemma 4.5.2 follows. □


CHAPTER  FIVE 
MONTE  CARLO  RESULTS  AND  CONCLUSION 

5.1  Introduction

The first three chapters of this dissertation have been devoted to developing test statistics for the purpose of testing for scale differences in censored matched pairs. Chapter Four used the statistics proposed in Chapters Two and Three to develop a vector of statistics designed to test for the more general alternative of location and/or scale differences. This chapter will investigate the performance of some of the test statistics proposed.

In Section 5.2, a simulation study will be presented to compare the intermediate sample size performance of some members of the proposed class of statistics presented in Chapter Three and the CD statistic of Chapter Two. Similarly, Section 5.3 will present a simulation study for selected $W_n$ vectors and a test statistic proposed by Seigel and Podger (1982). In each of these simulation studies, the asymptotic distribution of each test statistic is being used.
5.2  Monte Carlo for the Scale Test

In this section, nine statistics will be investigated to compare their performance under the null and alternative hypotheses. Three of the nine are versions of the CD statistic, where each version uses a different estimator for the variance of CD (Section 2.4). The next three statistics are members of the class of statistics proposed in Section 3.2, where the common location parameter was known, while the following three are members of the class of statistics proposed in Section 3.3, where the common location parameter was unknown and thus estimated. Table 5.1 summarizes the nine statistics considered in this Monte Carlo.

Table 5.1  Summary of the Test Statistics Considered in the
Monte Carlo for Scale.*

Test                                                                    Section
Statistic  Description                                                  in Thesis

    1      CD statistic using $\mathrm{Var}_1(CD)$                         2.4
    2      CD statistic using $\mathrm{Var}_2(CD)$                         2.4
    3      CD statistic using $\mathrm{Var}_3(CD)$                         2.4
    4      $T_{n_1,n_c} = T_{n_1} + T_{n_c}$ (i.e., $L_1 = L_2 = 1$)       3.2
    5      $T_{n_1,n_c} = 2T_{n_1} + T_{n_c}$ (i.e., $L_1 = 2$, $L_2 = 1$) 3.2
    6      $T_{n_1,n_c} = T_{n_1} + 2T_{n_c}$ (i.e., $L_1 = 1$, $L_2 = 2$) 3.2
    7      $TM_{n_1,n_c} = T_{n_1}(M) + T_{n_c}$ (i.e., $L_1 = 1$, $L_2 = 1$)   3.3
    8      $TM_{n_1,n_c} = 2T_{n_1}(M) + T_{n_c}$ (i.e., $L_1 = 2$, $L_2 = 1$)  3.3
    9      $TM_{n_1,n_c} = T_{n_1}(M) + 2T_{n_c}$ (i.e., $L_1 = 1$, $L_2 = 2$)  3.3

*  $T_{n_1}(M)$ denotes the $T_{n_1}$ statistic of Section 3.3, which
uses an estimate for the common location parameter.

In this Monte Carlo, three bivariate distributions (each
with common location (0,0)) were considered for generating
the bivariate samples.  Although the common location is
generally not (0,0), this choice was made without loss of
generality.  The first distribution, the bivariate normal,
was generated using the subroutine GGNSM of the
International Mathematical and Statistical Library (IMSL),
which allows specification of the variance-covariance
structure for the bivariate pairs.  The remaining two
distributions were generated using a technique for
generating elliptically symmetric distributions proposed in
Johnson and Ramberg (1977).  Both are Pearson Type VII
multivariate distributions, which were generated in the
following manner.  For a sample of size n, 2n uniform [0,1]
random variables (denoted $U_i$, $i = 1, 2, \ldots, 2n$) were
first generated and then the following transformations were
applied:

    $X^*_{1i} = (U_{2i-1}^{1/(1-v)} - 1)^{1/2} \cos(2\pi U_{2i})$
    $X^*_{2i} = (U_{2i-1}^{1/(1-v)} - 1)^{1/2} \sin(2\pi U_{2i})$

where the parameter v (for bivariate pairs, v > 1) specifies
a particular Pearson Type VII distribution.  To generate
bivariate pairs with scale parameters $\sigma_1$ and $\sigma_2$ and
correlation $\rho$, the following transformation was applied:

    $X_{1i} = \sigma_1 X^*_{1i}$

and

    $X_{2i} = \rho \sigma_2 X^*_{1i} + \sigma_2 (1 - \rho^2)^{1/2} X^*_{2i}$.

The values of v which were chosen were v = 1.5 (which
corresponds to a bivariate Cauchy distribution) and v = 3 (a
distribution with moments and moderate tailweight).
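
The generation scheme just described can be sketched in a few lines. This is only an illustrative Python version, not the Fortran/IMSL code used in the study; the function name `pearson_vii_pairs`, its arguments, and the use of Python's `random` module are assumptions made here, and no censoring is applied.

```python
import math
import random


def pearson_vii_pairs(n, v, sigma1=1.0, sigma2=1.0, rho=0.0, rng=None):
    """Return n uncensored bivariate Pearson Type VII pairs (hypothetical helper)."""
    if v <= 1.0:
        raise ValueError("v must exceed 1 for bivariate pairs")
    rng = rng or random.Random()
    pairs = []
    for _ in range(n):
        u_odd = 1.0 - rng.random()   # plays the role of U_{2i-1}, kept in (0, 1]
        u_even = rng.random()        # plays the role of U_{2i}
        # Radial part: (U^{1/(1-v)} - 1)^{1/2}; max() guards against rounding.
        radius = math.sqrt(max(u_odd ** (1.0 / (1.0 - v)) - 1.0, 0.0))
        x1_star = radius * math.cos(2.0 * math.pi * u_even)
        x2_star = radius * math.sin(2.0 * math.pi * u_even)
        # Impose the scale parameters and the correlation rho.
        x1 = sigma1 * x1_star
        x2 = rho * sigma2 * x1_star + sigma2 * math.sqrt(1.0 - rho ** 2) * x2_star
        pairs.append((x1, x2))
    return pairs


# v = 1.5 gives the bivariate Cauchy case, v = 3 the moderate-tailed case.
sample = pearson_vii_pairs(25, v=3.0, sigma2=math.sqrt(2.0), rho=0.5,
                           rng=random.Random(1))
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when comparing several test statistics on the same simulated samples.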

To generate the censoring random variable, the natural
logarithm of a Uniform [0,B] random variable was used.  The
choice of B was made separately for each distribution and
each correlation $\rho$, so that under $H_0$ approximately 25%
of the total sample was censored in some manner.  Three
values for $\rho$ were chosen: (1) $\rho$ = .2 (weak correlation),
(2) $\rho$ = .5 (moderate correlation) and (3) $\rho$ = .8 (strong
correlation).  Note that the value of $\rho$ affects the type of
censoring occurring in the samples; that is, when $\rho$ = .8,
type 4 pairs dominate the observations which are censored,
while when $\rho$ = .2, type 2 and 3 pairs dominate the
observations which are censored.  Since the results
presented in this section apply to the pattern of censoring
described above, any conclusions drawn apply only to this
form of censoring.

In each case the null hypothesis was $H_0$: $\sigma_1 = \sigma_2$ and
the alternative was $H_a$: $\sigma_2 > \sigma_1$.  The tests were
conducted at the .05 level of significance using the
asymptotic distribution for each test statistic.  The first
Monte Carlo study consisted of generating 1000 independent
censored samples of size 25, while the second utilized 1000
independent samples of size 40.  In each, the value of
$\sigma_1^2$ was 1.0 while the value of $\sigma_2^2$ was 1.0 (under
$H_0$) or 2.0 or 3.0 (each value corresponding to a different
run of the Fortran program listed in Appendix 2).  Tables
5.2-5.4 give the results of the Monte Carlo for each
distribution type, with entries corresponding to the number
of times a statistic rejected $H_0$.  The nine test statistics
are numbered in accordance with the listing in Table 5.1.
The standard deviation associated with each entry e can be
estimated by $(e(1000 - e)/1000)^{1/2}$.
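
As a small worked illustration of this standard deviation estimate, the following Python sketch computes it for a rejection count. The function names are hypothetical and chosen here only for illustration; the formula is the binomial one quoted above.

```python
import math


def rejection_count_sd(rejections, replications):
    """Estimated standard deviation of a rejection count: (e(N - e)/N)^(1/2)."""
    return math.sqrt(rejections * (replications - rejections) / replications)


def empirical_power(rejections, replications):
    """Proportion of replications in which the null hypothesis was rejected."""
    return rejections / replications


# A count of 41 rejections in 1000 replications, as an example entry.
sd_41 = rejection_count_sd(41, 1000)
# A count equal to the nominal level (50 of 1000) for comparison.
sd_50 = rejection_count_sd(50, 1000)
```

With 1000 replications, a count equal to the nominal 50 has an estimated standard deviation of about 6.9, which gives a rough yardstick for judging whether an observed null count departs from the .05 level by more than simulation noise.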

Inspecting Tables 5.2-5.4, we see that as the correlation
between the components of the bivariate pairs increases,
the power increases for all the tests, regardless of the
distribution considered.  This reflects the fact that the
tests were designed to use the intra-pair information, or
at least some of it in the case of the distribution-free
statistics, which correspond to columns 4-9.  The CD
statistics (columns 1-3), which use more intra-pair and
inter-pair information than the distribution-free
statistics, perform the best across all of the
distributions considered.

Table 5.2  Approximate Powers of the Tests for the Bivariate
Normal Distribution

$\rho$     n  $\sigma_2^2$     1     2     3     4     5     6     7     8     9

.2        25           1.0    41    49    42    45    45    44    40    40    42
          25           2.0   312   372   337   294   268   307   287   258   292
          25           3.0   561   619   584   526   489   553   506   471   541
          40           1.0    35    44    34    41    39    42    45    44    41
          40           2.0   489   523   492   412   369   445   400   361   443
          40           3.0   790   809   795   704   665   732   699   644   731

.5        25           1.0    46    55    45    51    55    47    53    54    48
          25           2.0   371   429   388   342   325   358   331   309   344
          25           3.0   630   688   657   608   568   642   590   543   615
          40           1.0    42    47    42    47    46    47    49    51    47
          40           2.0   538   575   547   476   445   516   475   443   517
          40           3.0   869   885   871   780   736   835   765   721   816

.8        25           1.0    56    61    47    53    56    53    55    54    53
          25           2.0   546   597   549   507   482   529   495   477   528
          25           3.0   831   877   855   796   773   827   774   747   799
          40           1.0    50    56    46    45    48    46    50    52    48
          40           2.0   770   795   772   671   655   719   671   646   711
          40           3.0   974   979   977   931   921   947   914   893   935


Table 5.3  Approximate Powers of the Tests for the Bivariate
Cauchy Distribution (Pearson Type VII, v=1.5)

$\rho$     n  $\sigma_2^2$     1     2     3     4     5     6     7     8     9

.2        25           1.0    48    64    57    50    46    51    56    57    54
          25           2.0   222   294   269   273   269   275   252   244   262
          25           3.0   422   521   491   468   451   460   441   431   444
          40           1.0    38    50    46    49    48    48    54    52    51
          40           2.0   353   399   383   365   355   362   348   338   350
          40           3.0   632   684   665   637   611   639   613   589   623

.5        25           1.0    46    67    58    52    51    56    56    46    58
          25           2.0   272   343   324   307   296   327   296   281   312
          25           3.0   495   587   571   567   537   564   516   503   532
          40           1.0    40    51    48    45    47    42    38    41    39
          40           2.0   424   472   458   451   424   453   410   401   430
          40           3.0   735   773   764   741   721   757   718   682   724

.8        25           1.0    42    61    49    55    53    53    55    54    57
          25           2.0   405   492   467   450   438   467   424   420   436
          25           3.0   679   778   751   762   735   770   718   698   739
          40           1.0    49    63    57    55    54    60    58    59    62
          40           2.0   612   665   644   656   642   675   620   601   640
          40           3.0   907   925   924   926   917   933   908   895   920


Table 5.4  Approximate Powers of the Tests for the Bivariate
Pearson Type VII, v=3

$\rho$     n  $\sigma_2^2$     1     2     3     4     5     6     7     8     9

.2        25           1.0    52    69    56    55    54    59    54    54    53
          25           2.0   262   330   288   268   255   280   263   242   276
          25           3.0   483   562   514   492   470   513   464   444   492
          40           1.0    41    46    43    46    46    47    53    51    51
          40           2.0   420   460   428   380   372   392   364   350   382
          40           3.0   737   770   753   672   649   711   655   623   684

.5        25           1.0    48    62    52    50    52    52    59    61    58
          25           2.0   319   380   350   321   314   330   298   290   315
          25           3.0   567   648   618   574   554   599   552   527   570
          40           1.0    46    54    48    48    47    49    49    53    47
          40           2.0   497   540   506   452   437   477   446   426   461
          40           3.0   802   833   812   767   743   790   749   719   767

.8        25           1.0    49    66    51    54    53    54    54    54    54
          25           2.0   484   555   513   504   497   516   484   464   490
          25           3.0   782   835   816   769   760   791   754   744   775
          40           1.0    44    56    48    53    54    54    63    64    62
          40           2.0   713   742   725   685   674   695   649   642   663
          40           3.0   953   964   958   941   927   947   933   920   942


Recall that the only difference among the CD statistics
(columns 1-3) is the method of estimating the variance.  The
CD statistic corresponding to column 2, which uses
$\mathrm{Var}_2(CD)$ (an estimate for the asymptotic variance
($4\gamma$), which is the variance of a conditional expectation,
page 41 in this thesis), has some problem maintaining the
significance level under $H_0$ and thus is not recommended.
The CD statistics corresponding to columns 1 and 3 maintain
the .05 level under $H_0$, with the statistic corresponding to
column 3 performing the best over the alternatives.  Recall
that the CD statistic which uses $\mathrm{Var}_1(CD)$ (column 1)
uses a U-statistic to estimate the asymptotic variance
($4\gamma$), while the CD statistic which uses $\mathrm{Var}_3(CD)$
(column 3) estimates the exact variance, that is,

    $\frac{2}{n(n-1)} \alpha + \frac{4(n-2)}{n(n-1)} \gamma$.

For the CD statistics, only once was there a negative
estimate for the variance, which occurred for the Pearson
Type VII, v=3 distribution with n=25 and $\sigma_2^2$ = 2.0.  One
basic disadvantage of the CD statistics is that they
require the use of a computer to perform the calculations
for even moderate sample sizes.  The CDSTAT subroutine of
the Fortran program listed in Appendix 2 could be used.  If
a computer, or the knowledge necessary to program the
calculations for the CD statistic, is not available, then a
distribution-free test could be recommended.

Of the distribution-free tests in columns 4-9, the tests
which use weights of $L_1$ = 1 and $L_2$ = 2 (columns 6 and 9)
appear to perform the best.  Column 6 corresponds to the
statistic for the case when the common location parameter
is known, while column 9 corresponds to the statistic for
the case when the common location parameter is unknown.
The corresponding test statistics using equal weights, that
is, $L_1 = L_2 = 1$ (column 4 corresponds to the case of the
known location parameter, while column 7 corresponds to the
case of the unknown location parameter), follow closely
behind in terms of power.  If the common location is known,
in each case the power is improved by using that known
value (column 6 for weights $L_1$ = 1 and $L_2$ = 2, or column 4
for equal weighting).  Note that when the correlation is
high, all the statistics considered in this Monte Carlo
perform well.

In summary, the best statistic to use is CD using
$\mathrm{Var}_3(CD)$ when the necessary computations, which
require a computer, can be done.  If it is not possible to
calculate the CD statistic, the distribution-free test
statistic using weights of $L_1$ = 1 and $L_2$ = 2
(corresponding to column 6) is recommended when the
location parameter is known.  If the location parameter is
unknown, then the distribution-free test statistic using
weights of $L_1$ = 1 and $L_2$ = 2 (corresponding to column 9)
is recommended.


5.3   Monte  Carlo  for  the  Location/Scale  Test 

In this section, twelve statistics are investigated to
compare their performance under the null and alternative
hypotheses considered in Chapter Four.  The alternative
hypotheses studied here include alternatives for location
differences only, for scale differences only and for
location and scale differences.  The first nine statistics
which will be considered are various versions of the
quadratic forms presented in Chapter Four.  The tenth
statistic was introduced by Seigel and Podger (1982) and
will be defined in this section.  The last two statistics,
which are from Chapters Two and Three, are included here to
determine their performance under the alternatives
considered in this section.  The first is the
distribution-free statistic $TM_{n_1,n_c}$ which uses weights of
$L_1$ = 1 and $L_2$ = 2 (corresponding to statistic 9 in Section
5.2).  The second is the CD statistic which uses
$\mathrm{Var}_3(CD)$ to estimate the variance of CD
(corresponding to statistic 3 in Section 5.2).

Of the nine quadratic forms from Chapter Four, the first
three are versions of $W_{2n}' \Sigma_2^{-1} W_{2n}$, where $W_{2n}$ is the
vector of statistics which uses $TE_{n_1,n_c}$ (which tests for
location) and $TM_{n_1,n_c}$ (which tests for scale) with the
common location estimate.  The three different versions
correspond to different choices for the weights, $K_{1n}$,
$K_{2n}$, $L_{1n}$ and $L_{2n}$, used in forming $W_{2n}$.  The next three
quadratic forms are versions of $W_{1n}' \Sigma_1^{-1} W_{1n}$, where
$W_{1n}$ is the vector of statistics which uses $TE_{n_1,n_c}$ and
$T_{n_1,n_c}$ (which tests for scale) with the separate location
estimates.  Similar to the first three quadratic forms,
these three different versions correspond to different
choices for the weights $K_{1n}$, $K_{2n}$, $L_{1n}$ and $L_{2n}$.  The
choices of weights used here are identical to those used in
$W_{2n}$ and will be defined shortly.  The last three quadratic
forms correspond to different versions of
$W_{3n}' \Sigma_3^{-1} W_{3n}$, where $W_{3n}$ is the vector of statistics
which uses $TE_{n_1,n_c}$ and CD (the standardized CD statistic)
with $\mathrm{Var}_3(CD)$ as an estimate of the variance of CD.
Again, the three versions correspond to different choices
for the weights $K_{1n}$ and $K_{2n}$.  For the weights $K_{1n}$,
$K_{2n}$, $L_{1n}$ and $L_{2n}$, the following choices were used:

(1)  $K_{1n} = K_{2n} = L_{1n} = L_{2n} = 1$.  (This amounts to just
     summing the standardized statistics to form the
     location statistic and the scale statistic.)

(2)  $K_{1n} = L_{1n} = \frac{n_1}{n_1 + n_c}$ and
     $K_{2n} = L_{2n} = \frac{n_c}{n_1 + n_c}$.  (This weights each
     statistic proportionally to the sample size used in
     its calculation and will be denoted as SS weights.)

and

(3)  $K_{1n} = \frac{\sigma(TE_{n_1})}{\sigma(TE_{n_1,n_c})}$,
     $K_{2n} = \frac{\sigma(TE_{n_c})}{\sigma(TE_{n_1,n_c})}$,
     $L_{1n} = \frac{\sigma(T_{n_1})}{\sigma(T_{n_1,n_c})}$ and
     $L_{2n} = \frac{\sigma(T_{n_c})}{\sigma(T_{n_1,n_c})}$,
     where $\sigma(TE_{n_1})$, $\sigma(TE_{n_c})$ and $\sigma(TE_{n_1,n_c})$
     represent the standard deviations of $TE_{n_1}$, $TE_{n_c}$
     and $TE_{n_1} + TE_{n_c}$, respectively, under the null
     hypothesis.  Similarly, $\sigma(T_{n_1})$, $\sigma(T_{n_c})$ and
     $\sigma(T_{n_1,n_c})$ represent the standard deviations of
     $T_{n_1}$, $T_{n_c}$ and $T_{n_1,n_c}$, respectively, under the null
     hypothesis.  (This weights each statistic
     proportionally to its null standard deviation and
     will be denoted as STD weights.)

For the quadratic forms using CD (i.e., versions of
$W_{3n}' \Sigma_3^{-1} W_{3n}$) only the weights $K_{1n}$ and $K_{2n}$ were used,
since the definition of CD did not include any weights.  For
the quadratic forms based on $W_{2n}$ and $W_{1n}$, the three
choices for the weights $K_{1n}$, $K_{2n}$, $L_{1n}$ and $L_{2n}$
correspond to the three versions of each statistic which
will be presented.  Table 5.5 summarizes the twelve
statistics considered in this Monte Carlo.
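
The three weighting schemes can be written out as a short Python sketch. The function names and the numerical inputs below are hypothetical, introduced only to illustrate the formulas in choices (1) to (3); the null standard deviations would in practice come from the results of the earlier chapters.

```python
def equal_weights():
    """Choice (1): both component statistics receive weight 1."""
    return 1.0, 1.0


def ss_weights(n1, nc):
    """Choice (2), SS weights: proportional to the number of pairs used."""
    total = n1 + nc
    if total <= 0:
        raise ValueError("n1 + nc must be positive")
    return n1 / total, nc / total


def std_weights(sd_first, sd_second, sd_combined):
    """Choice (3), STD weights: each null standard deviation divided by
    the null standard deviation of the combined statistic."""
    if sd_combined <= 0:
        raise ValueError("combined standard deviation must be positive")
    return sd_first / sd_combined, sd_second / sd_combined


def weighted_statistic(weights, first, second):
    """Linear combination such as K_1n * TE_n1 + K_2n * TE_nc."""
    k1, k2 = weights
    return k1 * first + k2 * second


# Illustrative values only: 26 fully uncensored pairs, 9 singly censored pairs.
k1, k2 = ss_weights(26, 9)
combined = weighted_statistic((k1, k2), 1.8, 0.6)
```

Under the STD scheme, the component with the larger null standard deviation receives the larger weight.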

Table 5.5  Summary of the Test Statistics Considered in the
Monte Carlo for Location and/or Scale Alternatives.

Test                                                                        Section
Statistic  Description                                                      in Thesis

    1      $W_{2n}' \Sigma_2^{-1} W_{2n}$ with $K_{1n} = K_{2n} = L_{1n} = L_{2n} = 1$   4.2
    2      $W_{2n}' \Sigma_2^{-1} W_{2n}$ with SS weights                          4.2
    3      $W_{2n}' \Sigma_2^{-1} W_{2n}$ with STD weights                         4.2
    4      $W_{1n}' \Sigma_1^{-1} W_{1n}$ with $K_{1n} = K_{2n} = L_{1n} = L_{2n} = 1$   4.2
    5      $W_{1n}' \Sigma_1^{-1} W_{1n}$ with SS weights                          4.2
    6      $W_{1n}' \Sigma_1^{-1} W_{1n}$ with STD weights                         4.2
    7      $W_{3n}' \Sigma_3^{-1} W_{3n}$ with $K_{1n} = K_{2n} = 1$                4.3
    8      $W_{3n}' \Sigma_3^{-1} W_{3n}$ with SS weights                          4.3
    9      $W_{3n}' \Sigma_3^{-1} W_{3n}$ with STD weights                         4.3
   10      Seigel Podger statistic                                          5.3
   11      $TM_{n_1,n_c} = T_{n_1}(M) + 2T_{n_c}$                                3.3
   12      CD statistic using $\mathrm{Var}_3(CD)$                             2.4


The test which was proposed by Seigel and Podger (1982)
can be viewed as a special case of the procedure commonly
referred to as the log rank method proposed by Mantel
(1966).  This test assumes the null hypothesis that the
survival curve for $X_{1i}$ is identical to that of $X_{2i}$ at all
points.  The alternative for this test is that differences
exist between the two curves.  Note that this is a more
general alternative than the alternatives specified in
Chapters Two and Three, but the test was included to
determine how well it performed for the alternatives
considered here.  The test statistic is defined as follows.
Let n represent the total number of type 1, 2, and 3 pairs
and let $n_+$ be the number of pairs for which $X_{1i} > X_{2i}$.
Similarly, define $n_-$ to be the number of pairs in which
$X_{1i} < X_{2i}$.  Note that $n_+ + n_- = n$.  The test proposed by
Seigel and Podger (1982) is to compare the observed
frequencies $n_+$ and $n_-$ against the expected values of n/2
(under $H_0$) using the binomial distribution or, when
appropriate, an approximate large sample distribution.  A
suggested statistic which would be appropriate for the
approximation is McNemar's statistic, which could be
defined here as

    $T_{SP} = (n_+ - n_-)^2 / n$.

Under $H_0$, $T_{SP}$ has a limiting $\chi^2_{(1)}$ distribution.

For this Monte Carlo study, the uncensored bivariate
samples were generated using the techniques described in
Section 5.2.  Those techniques generate bivariate pairs with
scale parameters $\sigma_1$ and $\sigma_2$ and correlation $\rho$.  Without
loss of generality, the location parameter for $X_{2i}$ was
chosen to be 0, while the location parameter for $X_{1i}$ was
$\mu_1$.  The censoring random variables were also generated in
the same manner as in Section 5.2, so that under $H_0$
approximately 25% of the total sample was censored in some
manner.  The values for $\rho$ considered in this section are
.2, .5 and .8 (as in Section 5.2).
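
For reference, the McNemar-type form of the Seigel and Podger comparison defined earlier in this section can be sketched as follows. This is a minimal Python illustration, not the program used in the study. The function names are hypothetical, and the counts of pairs ordered each way are taken as given, since deciding the order of a censored pair depends on the pair types defined in earlier chapters.

```python
import math


def seigel_podger_statistic(n_plus, n_minus):
    """McNemar-type statistic (n_plus - n_minus)^2 / n with n = n_plus + n_minus."""
    n = n_plus + n_minus
    if n <= 0:
        raise ValueError("at least one orderable pair is required")
    return (n_plus - n_minus) ** 2 / n


def chi2_1_sf(x):
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


# Illustrative counts only: 20 pairs with X_1i > X_2i and 10 with X_1i < X_2i.
t_sp = seigel_podger_statistic(20, 10)
p_value = chi2_1_sf(t_sp)
reject_at_05 = p_value < 0.05
```

The tail probability uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal variable, so only the standard library's `math.erfc` is needed.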

For each Monte Carlo run the null hypothesis was
$H_0$: $\mu_1 = \mu_2$ and $\sigma_1 = \sigma_2$, and the alternative was
$H_a$: $\mu_1 \neq \mu_2$ and/or $\sigma_1 \neq \sigma_2$.  The tests were
conducted at the .05 level of significance using the
asymptotic distribution for each test statistic plus
consistent variance-covariance estimators defined in
Section 4.5, where appropriate.  This study consisted of
generating 500 independent censored samples of size 35.  To
produce the alternatives considered here, the value of
$\sigma_1^2$ was 1.0, while the value of $\sigma_2^2$ was 1.0, 2.0, or
3.0.  Similarly, the value of $\mu_2$ was 0, while the value of
$\mu_1$ was 0.0, 0.5 or 1.0.  Only seven alternatives (in
addition to the null hypothesis) were chosen for this study
and correspond to the following choices for $\mu_1$ and
$\sigma_2^2$:

(1)  $\mu_1$ = 0.0 and $\sigma_2^2$ = 2.0,

(2)  $\mu_1$ = 0.0 and $\sigma_2^2$ = 3.0,

(3)  $\mu_1$ = 0.5 and $\sigma_2^2$ = 1.0,

(4)  $\mu_1$ = 1.0 and $\sigma_2^2$ = 1.0,

(5)  $\mu_1$ = 0.5 and $\sigma_2^2$ = 2.0,

(6)  $\mu_1$ = 0.5 and $\sigma_2^2$ = 3.0

and

(7)  $\mu_1$ = 1.0 and $\sigma_2^2$ = 3.0.

Tables 5.6-5.8 give the results of the Monte Carlo for each
distribution type, with entries corresponding to the number
of times a statistic rejected $H_0$.  The twelve statistics are
numbered in accordance with the listing in Table 5.5.  The
column headings Null, Scale, Location and Location/Scale
refer to the type of bivariate pairs being generated and the
type of alternative that it reflects.  The standard
deviation associated with each entry e can be estimated by
$(e(500 - e)/500)^{1/2}$.  Covariances between certain entries
were also estimated but are not reported here.  The
following discussion refers only to statistics 1 to 10 in
Tables 5.6-5.8.  The discussion of statistics 11 and 12 will
follow.
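
Each of statistics 1 to 9 is a quadratic form in a 2-vector of statistics and an estimated covariance matrix. The sketch below, in Python, shows how such a form might be evaluated and referred to a chi-square distribution. It is an illustration only: the function names are hypothetical, and the use of a chi-square reference distribution with 2 degrees of freedom is an assumption made here (the usual limit for a quadratic form in an asymptotically bivariate normal vector), not a statement taken from this chapter. The positive definiteness check mirrors the footnotes to Tables 5.6-5.8, where a few estimated covariance matrices failed to be positive definite.

```python
import math


def quadratic_form_2d(w, sigma):
    """Return w' sigma^{-1} w for a 2-vector w and a symmetric 2x2 matrix sigma."""
    (a, b), (c, d) = sigma
    if not math.isclose(b, c, rel_tol=1e-9, abs_tol=1e-12):
        raise ValueError("covariance matrix must be symmetric")
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    w1, w2 = w
    # The inverse of [[a, b], [b, d]] is [[d, -b], [-b, a]] / det.
    return (d * w1 * w1 - 2.0 * b * w1 * w2 + a * w2 * w2) / det


def chi2_2_sf(x):
    """Upper tail probability of a chi-square variable with 2 degrees of freedom."""
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.exp(-x / 2.0)


# Illustrative values only: a location component, a scale component and
# an estimated covariance matrix for the pair.
q = quadratic_form_2d((1.9, 1.2), ((1.0, 0.3), (0.3, 1.0)))
p_value = chi2_2_sf(q)
```

A replication whose estimated covariance matrix fails the check raises an error here; how such replications were handled in the study itself is not stated in this section.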

Table 5.6  Number of Rejections in 500 Replications for the
Bivariate Normal Distribution

                   Null     Scale       Location       Location/Scale
$\mu_1$ =           0.0    0.0    0.0    0.5    1.0    0.5    0.5    1.0
$\sigma_2^2$ =      1.0    2.0    3.0    1.0    1.0    2.0    3.0    3.0
Stat

$\rho$ = .2
   1                 25    110    241    220    467    258    333    451
   2                 33    107    223    220    461    259    330    453
   3                 29     97    203    153    409    231    295    397
   4                 30    105    235    224    471    230    293    434
   5                 35    108    233    230    470    239    299    437
   6                 40    104    216    188    436    228    288    404
   7                 32    157    299    231    472    268    369    459
   8                 34    148    294    233    466    266    370    459
   9                 33    140    289    179    418    221    314    400
  10                 24     26     29    227    465    204    189    412
  11                 22    177    343     21      7    126    243     60
  12                 18    222    379      7      2    125    266    128

$\rho$ = .5
   1                 25    124    302    295    494    355    409    488
   2                 30    123    287    300    493    352    408    486
   3                 31    115    256    233    477    325    391    470
   4                 25    127    298    290    493    322    388    481
   5                 36    129    278    304    496    332    383    481
   6                 40    122    263    278    481    315    368    472
   7                 36    185    361    308    493    366    427    489
   8                 35    181    363    311    494    364    425    487
   9                 32    180    355    257    481    319    400    471
  10                 23     25     35    295    492    278    265    469
  11                 28    226    384     45     19    209    327    124
  12                 19    268    403      5      3    156    296    149

$\rho$ = .8
   1                 28    239    405    460    500    475    487    500
   2                 33    243    399    466    500    477    485    500
   3                 30    214    375    434    500    472    483    500
   4                 22    231    418    455    500    463    483    500
   5                 37    234    394    470    500    469    482    500
   6                 41    215*   368    448    500    465    482    500
   7                 39    307    460    461    500    478    493    500
   8                 41    303    459    469    500    480    492    500
   9                 38    307    454    439    500    469    488    500
  10                 31     37     45    459    500    440    414    500
  11                 25    322    445     74     31    347    445    198
  12                 22    361    477      7      2    230    413    236

*  indicates one covariance matrix was not positive definite


Table 5.7  Number of Rejections in 500 Replications for the
Bivariate Cauchy Distribution

                   Null     Scale       Location       Location/Scale
$\mu_1$ =           0.0    0.0    0.0    0.5    1.0    0.5    0.5    1.0
$\sigma_2^2$ =      1.0    2.0    3.0    1.0    1.0    2.0    3.0    3.0
Stat

$\rho$ = .2
   1                 28     92    197     98    304    167    258    354
   2                 31     99    204    113    317    175    260    355
   3                 33     95    193    107    284    170    245    327*
   4                 33     96    197     97    306    167    243    337
   5                 36    105    206    115    333    176    250    348
   6                 37     90    199    130    320    177    242    330
   7                 42    121    244    114    328    173    269    344
   8                 38    123    244    120    348    179    269    351
   9                 37    121    245    122    331    167    254    334
  10                 26     27     29    156    368    141    137    320
  11                 35    169    298     27     18    152    266    181
  12                 35    198    322     26     17    159    280    212

$\rho$ = .5
   1                 22    106    224    140    374    222    308    421
   2                 22    117    233    158    390    232    306    419
   3                 30    110    209    141*   360*   219*   305    391*
   4                 24    112    230    142    377    203    289    397
   5                 30    114    235    164    402    216    286    406
   6                 40    110    221    176*   417*   216    293    394
   7                 32    156    280    150    385    224    309    401
   8                 33    152    273    171    399    235    316    406
   9                 35    150    269    173    412    210    309    394
  10                 28     30     34    196    430    204    183    400
  11                 24    190    347     35     27    192    323    243
  12                 32    215    364     25     14    165    310    230

$\rho$ = .8
   1                 15    167    334    246    449    367    432    484
   2                 20    170    338    281    471    366    430    484
   3                 29*   158    312*   270*   459    363**  429    472
   4                 21    165    325    250    455    355    413    479
   5                 27    170    342    312    476    366    418    480
   6                 39    159*   322*   315    486    361    423    477
   7                 45    223    394    271    464    342    420    474
   8                 36    228    390    328    478    363    427    480
   9                 34    226    389    319    483    355    424    473
  10                 26     31     54    361    489    368    336    487
  11                 29    279    435     48     55    299    435    338
  12                 32    297    446     27     13    243    401    307

*  indicates one covariance matrix was not positive definite
** indicates two covariance matrices were not positive
   definite


Table 5.8  Number of Rejections in 500 Replications for the
Bivariate Pearson VII, v=3

                   Null     Scale       Location       Location/Scale
$\mu_1$ =           0.0    0.0    0.0    0.5    1.0    0.5    0.5    1.0
$\sigma_2^2$ =      1.0    2.0    3.0    1.0    1.0    2.0    3.0    3.0
Stat

$\rho$ = .2
   1                 28     99    224    220    462    263    314    445
   2                 32    102    223    219    461    262    313    445
   3                 34    102    208    168    416    223    286    405
   4                 29    104    222    212    472    248    285    440
   5                 37    103    221    223    472    257    290    439
   6                 39    106    217    201    453    234    277    413
   7                 38    139    277    227    469    275    339    452
   8                 38    137    273    229    468    274    334    450
   9                 39    134    264    199    443    221    293    409
  10                 30     26     29    227    469    200    193    427
  11                 27    179    320     26      7    144    236     69
  12                 32    198    355     16      4    130    253    132

$\rho$ = .5
   1                 25    112    261    296    491    338    402    488
   2                 26    122    253    297    491    337    399    487
   3                 29    115    226    230    467    297    377    471
   4                 25    118    249    301    495    308    378    484
   5                 33    123    251    314    495    313    379    486
   6                 35    113    229    275    484    292    361    472
   7                 32    173    329    309    494    341    408    487
   8                 33    167    324    313    495    341    407    489
   9                 35    166    315    276    486    297    379    464
  10                 28     27     27    324    495    288    270    486
  11                 35    208    358     38     20    205    303    120
  12                 28    240    391     14      5    145    278    143

$\rho$ = .8
   1                 22    201    371    447    500    476    486    499
   2                 24    203    368    455    500    475    485    499
   3                 30    190*   335*   414    500    461    477    496
   4                 17    182    368    445    500    461    478    499
   5                 23    189    364    467    500    467    479    499
   6                 33    185*   329    446    500    463    476    497
   7                 38    267    432    450    500    466    435    499
   8                 34    266    430    463    500    475    486    499
   9                 37    257    425    442    500    458    474    497
  10                 25     28     41    463    500    454    422    499
  11                 34    305    441     58     31    309    421    189
  12                 38    325    467     17      7    217    384    222

*  indicates one covariance matrix was not positive definite


Inspecting Tables 5.6-5.8, we see that as the correlation
between the components of the bivariate pairs increases,
the power increases for all the tests.  Similar results
were observed in the Monte Carlo in Section 5.2.  For the
null hypothesis (column 1), all the tests maintain the
significance level under $H_0$ fairly well, although the
levels for the tests using CD show more fluctuation than
the others.  Of the CD tests, the fluctuation of the null
power is greatest for the statistic in row 7, where it
varies from 32 rejections (for the normal distribution with
$\rho$ = .2) all the way to 45 rejections (for the Cauchy
distribution with $\rho$ = .8).

For the alternatives in which the bivariate pairs were
generated with scale differences only (i.e., $\mu_1$ = 0.0 and
$\sigma_2^2$ = 2.0, or $\mu_1$ = 0.0 and $\sigma_2^2$ = 3.0), the first obvious
conclusion is that the Seigel Podger statistic (row 10) is
definitely not appropriate, because it tests for location
and not scale differences.  As one would expect from the
results in Section 5.2, the quadratic forms using CD as a
scale statistic (rows 7-9) perform better than the
quadratic forms using the distribution-free scale statistic
$TM_{n_1,n_c}$ (rows 1-3) or its analog which uses the separate
location estimates (rows 4-6).  No basic differences exist
between the three quadratic forms which use CD for these
two alternatives.

For the alternatives in which the bivariate pairs were
generated with location differences only ($\mu_1$ = 0.5 and
$\sigma_2^2$ = 1.0, or $\mu_1$ = 1.0 and $\sigma_2^2$ = 1.0), in general, the
Seigel Podger statistic (row 10) performs the best because
it specializes in this type of alternative.  If we look at
the bivariate Pearson VII distribution with v=3 and the
bivariate normal distribution only, the statistics
corresponding to rows 1, 2, 4, 5, 7 and 8 all perform
equivalently to the Seigel Podger statistic.  It is in the
bivariate Cauchy distribution where the Seigel Podger
statistic seems to have a slight advantage.  Note that the
statistics corresponding to rows 1, 4 and 7 use the equal
weighting scheme, while the statistics corresponding to rows
2, 5 and 8 use the sample size weighting scheme in forming
the corresponding $W_n$ vector.  A possible reason why the
standard deviation weighting scheme seems to diminish a
quadratic form's performance is that the variance for the
term $TE_{n_c}$ is relatively small compared to the variance of
$TE_{n_1}$.  Thus, in the linear combination
$K_{1n} TE_{n_1} + K_{2n} TE_{n_c}$, most of the weight is being given
to $TE_{n_1}$.  Popovich (1983) observed in his Monte Carlo
that this, in some cases, reduces the performance of the
location statistic.

Now we consider the alternatives in which the bivariate pairs were
generated with location and scale differences ($\mu_1 = 0.5$ and
$\sigma_2 = 2.0$, or $\mu_1 = 0.5$ and $\sigma_2 = 3.0$, or
$\mu_1 = 1.0$ and $\sigma_2 = 3.0$). For these alternatives, the
statistics corresponding to rows 7 and 8 perform the best overall. The
statistics corresponding to rows 1 and 2 perform equivalently to rows
7 and 8, except for the alternative where $\mu_1 = 0.5$ and
$\sigma_2 = 3.0$ for the bivariate Pearson VII distribution with
$\nu = 3$ and $\rho = .2$, and the bivariate normal distribution with
$\rho = .2$ or $\rho = .5$. This possibly reflects the fact that CD
performs slightly better than the distribution-free scale statistic
$TM_{n_1,n_c}$ when scale differences exist and that, for this
alternative (i.e., $\mu_1 = 0.5$ and $\sigma_2 = 3.0$), large scale
differences exist. As the correlation increases within any
distribution, all the statistics perform moderately well, and for the
last alternative ($\mu_1 = 1.0$ and $\sigma_2 = 3.0$) with
$\rho = .5$ or $\rho = .8$, no differences exist between the ten
statistics in general.

In summary, the recommended statistic is the quadratic form which uses
CD and the sample size (SS) weights for $TE_{n_1,n_c}$ (row 8). This
statistic provides the best power in general for the alternatives
considered here. The statistic corresponding to the quadratic form
which uses CD and equal weights for $TE_{n_1,n_c}$ (row 7) performs
for the most part equivalently, but is not recommended due to its
higher fluctuation of levels under the null hypothesis. Note that if,
instead of the more general alternative hypothesis considered here
(i.e., $H_a\colon \mu_1 \neq \mu_2$ and/or $\sigma_1 \neq \sigma_2$),
we can restrict the alternative to be more specific (i.e.,
$H_a\colon \sigma_1 \neq \sigma_2$), then in many cases the power of
the test can be improved by using a statistic which is specifically
designed for that alternative.

Finally, we turn our attention to the last two statistics (11 and 12)
included in this Monte Carlo. These statistics were included for two
reasons. The first reason was to investigate their performance when
the bivariate pairs were generated with equal marginal scale
parameters but unequal marginal locations. The second reason was to
determine what effect unequal marginal locations had on the power of
the tests when the bivariate pairs had unequal scale. Looking at the
columns labeled Location, we see that both tests are fairly robust
when the components of the bivariate pairs have equal scale but
unequal marginal locations. From the last three columns of each table,
we observe that slight differences in the location parameters do not
affect the power of the scale statistics appreciably, but as the
location differences become more pronounced the power of each test is
dramatically reduced. In conclusion, if slight differences exist
between the location parameters of the marginal distributions, the
tests for scale still perform appropriately, but as the differences
increase the tests have definite drawbacks.


APPENDIX  1 
TABLES  OF  CRITICAL  VALUES  FOR  TESTING  FOR  DIFFERENCES 
IN  SCALE 


The tables in this appendix list the critical values of

    $T_{n_1,n_c} = T_{n_1} + T_{n_c}$

(i.e., $T_{n_1,n_c}$ with $L_1 = L_2 = 1$) for .01, .025, .05 and .10
levels of significance for $n_1 = 1, 2, \ldots, 15$ and
$n_c = 1, 2, \ldots, 10$. These tables are also appropriate for
$TM_{n_1,n_c}$, since $T_{n_1,n_c}$ and $TM_{n_1,n_c}$ have the same
null distribution. For larger values of $n_1$ or $n_c$, the asymptotic
normal distribution of $T_{n_1,n_c}$ (and $TM_{n_1,n_c}$) could be
used. When $n_1 = 0$ or $n_c = 0$, the critical values can be obtained
from the critical values for the Wilcoxon signed rank distribution
based on $n_c$ or $n_1$ observations (respectively).
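
For sample sizes beyond the tables, the normal approximation can be sketched as follows. This is a minimal illustration in Python, not part of the original program: the function names are hypothetical, the null mean and variance are the usual signed rank moments (the same expressions the Monte Carlo program in Appendix 2 uses to standardize DFTST), and no continuity correction is applied.

```python
import math


def null_mean_var(n1, nc, l1=1.0, l2=1.0):
    """Null mean and variance of l1*W(n1) + l2*W(nc), where W(n) is a
    Wilcoxon signed rank statistic on n observations and the two
    statistics are independent."""
    mean = l1 * n1 * (n1 + 1) / 4.0 + l2 * nc * (nc + 1) / 4.0
    var = (l1 ** 2) * n1 * (n1 + 1) * (2 * n1 + 1) / 24.0 \
        + (l2 ** 2) * nc * (nc + 1) * (2 * nc + 1) / 24.0
    return mean, var


def upper_tail_normal(t, n1, nc, l1=1.0, l2=1.0):
    """Approximate P{T >= t} from the standard normal distribution
    (hypothetical helper; no continuity correction)."""
    mean, var = null_mean_var(n1, nc, l1, l2)
    z = (t - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

With $n_1 = 10$ and $n_c = 5$, `null_mean_var(10, 5)` gives a null mean of 35 and a null variance of 110, so an observed value of 53 corresponds to $z = 18/\sqrt{110}$.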

The critical values for this test statistic were derived for each
$n_1$ and $n_c$ by convolving two Wilcoxon signed rank statistics
(based on $n_1$ observations and $n_c$ observations). Thus, it follows
that the critical values for a test based on $n_1 = a$ and $n_c = b$
observations are the same as the critical values for a test based on
$n_1 = b$ and $n_c = a$ observations. Therefore, the tables can also
be viewed as listing the critical values for $n_1 = 1, 2, \ldots, 10$
and $n_c = 1, 2, \ldots, 15$. These critical values are tabled for the
test $H_0\colon \sigma_1 = \sigma_2$ versus
$H_a\colon \sigma_1 < \sigma_2$. To determine the critical values for
the alternatives $H_a\colon \sigma_1 > \sigma_2$ or
$H_a\colon \sigma_1 \neq \sigma_2$, the symmetry of $T_{n_1,n_c}$ about

    $\dfrac{n_1(n_1+1) + n_c(n_c+1)}{4}$

could be used to calculate the necessary cutoffs.
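
The symmetry argument can be written out explicitly. The display below is a sketch under the stated null distribution; the symbol $m$ for the largest attainable value of the statistic and $c$ for a tabled upper-tail critical value are notation introduced here for illustration.

```latex
\begin{align*}
m &= \frac{n_1(n_1+1) + n_c(n_c+1)}{2},
  \qquad T_{n_1,n_c} \overset{d}{=} m - T_{n_1,n_c}
  \quad\text{under } H_0, \\
P\{T_{n_1,n_c} \le m - c\} &= P\{T_{n_1,n_c} \ge c\}.
\end{align*}
```

Thus a tabled upper-tail cutoff $c$ gives the lower-tail cutoff $m - c$ with the same attained level, and a two-sided test that rejects when $T_{n_1,n_c} \ge c$ or $T_{n_1,n_c} \le m - c$ has twice that level. For $n_1 = 10$ and $n_c = 5$, $m = 70$, so the tabled value 53 corresponds to a lower-tail cutoff of 17.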

Since the Wilcoxon signed rank distribution is discrete, exact .01,
.025, .05 and .10 level critical values do not always exist. These
tables list the following:

1) The critical value (c) for a specific $\alpha$ level, such that
   $P\{T_{n_1,n_c} \ge c\} \le \alpha$.

2) The attained significance level (p-value) of each critical value
   is given in the parentheses (i.e.,
   $P\{T_{n_1,n_c} \ge c\}$ = p-value).

3) The attained significance level of the next closest critical
   value is given in the square brackets (i.e.,
   $P\{T_{n_1,n_c} \ge (c-1)\}$).

For example, let $n_1 = 10$ and $n_c = 5$; the critical value for a
.05 level test would be 53. The attained significance level for the
test would actually be .048. The next closest critical value would be
53 - 1 = 52, with an attained significance level of .059.
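
The three tabled quantities can be recomputed by direct convolution. The following Python sketch is an illustration only, not the program used to build the tables: the function names are hypothetical, probabilities are kept as exact fractions, and $0 < \alpha < 1$ is assumed.

```python
from fractions import Fraction


def signed_rank_counts(n):
    """counts[w] = number of subsets of {1,...,n} with sum w, i.e. the
    null frequency of a Wilcoxon signed rank statistic equal to w."""
    counts = [1] + [0] * (n * (n + 1) // 2)
    for r in range(1, n + 1):
        for w in range(len(counts) - 1, r - 1, -1):
            counts[w] += counts[w - r]
    return counts


def upper_tail(n1, nc):
    """tail[t] = P{T >= t} for T = W(n1) + W(nc) under the null."""
    a, b = signed_rank_counts(n1), signed_rank_counts(nc)
    conv = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):          # convolve the two frequency tables
        for j, bj in enumerate(b):
            conv[i + j] += ai * bj
    total = 2 ** (n1 + nc)
    tail, running = [Fraction(0)] * len(conv), 0
    for t in range(len(conv) - 1, -1, -1):
        running += conv[t]
        tail[t] = Fraction(running, total)
    return tail


def critical_value(n1, nc, alpha):
    """Return (c, P{T >= c}, P{T >= c-1}). When no cutoff attains
    level alpha, return (None, None, P{T >= m}) with m the maximum."""
    tail = upper_tail(n1, nc)
    m = len(tail) - 1
    if tail[m] > alpha:
        return None, None, tail[m]
    c = min(t for t in range(m + 1) if tail[t] <= alpha)
    return c, tail[c], tail[c - 1]
```

For $n_1 = 3$, $n_c = 1$ and $\alpha = .10$ the function returns the cutoff 7 with attained level 1/16 and next level 3/16. The call `critical_value(10, 5, 0.05)` can be compared against the worked example above.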

When $n_1$ and $n_c$ are both very small (generally less than 3),
many times a critical value does not exist for a specific level of
significance. Then, the value in the bracket is
$P\{T_{n_1,n_c} \ge m\}$, where

    $m = \dfrac{n_1(n_1+1) + n_c(n_c+1)}{2}$

(i.e., the largest value $T_{n_1,n_c}$ could be).


n_c = 1

  α    n_1 = 1            n_1 = 2            n_1 = 3
 .01   --- (    )[.250]   --- (    )[.125]   --- (    )[.063]
 .025  --- (    )[.250]   --- (    )[.125]   --- (    )[.063]
 .05   --- (    )[.250]   --- (    )[.125]   --- (    )[.063]
 .10   --- (    )[.250]   --- (    )[.125]     7 (.063)[.188]

  α    n_1 = 4            n_1 = 5            n_1 = 6
 .01   --- (    )[.031]   --- (    )[.016]    22 (.008)[.023]
 .025  --- (    )[.031]    16 (.016)[.047]    21 (.023)[.039]
 .05    11 (.031)[.094]    15 (.047)[.078]    20 (.039)[.063]
 .10    10 (.094)[.156]    14 (.078)[.125]    18 (.094)[.133]

  α    n_1 = 7            n_1 = 8            n_1 = 9
 .01    29 (.004)[.012]    35 (.010)[.016]    43 (.008)[.012]
 .025   27 (.020)[.031]    33 (.023)[.033]    40 (.023)[.032]
 .05    25 (.047)[.066]    31 (.047)[.064]    38 (.043)[.057]
 .10    23 (.094)[.129]    29 (.086)[.111]    35 (.092)[.113]

  α    n_1 = 10           n_1 = 11           n_1 = 12
 .01    51 (.008)[.012]    60 (.008)[.011]    69 (.009)[.012]
 .025   48 (.021)[.028]    56 (.024)[.030]    65 (.024)[.029]
 .05    45 (.047)[.059]    53 (.046)[.056]    62 (.042)[.051]
 .10    42 (.088)[.106]    49 (.095)[.111]    57 (.095)[.109]

  α    n_1 = 13           n_1 = 14           n_1 = 15
 .01    79 (.009)[.012]    90 (.009)[.011]   101 (.010)[.012]
 .025   75 (.022)[.026]    85 (.023)[.027]    96 (.022)[.026]
 .05    71 (.044)[.051]    80 (.049)[.056]    91 (.044)[.050]
 .10    66 (.090)[.101]    75 (.092)[.103]    84 (.099)[.109]


n_c = 2

  α    n_1 = 2            n_1 = 3            n_1 = 4
 .01   --- (    )[.063]   --- (    )[.031]   --- (    )[.016]
 .025  --- (    )[.063]   --- (    )[.031]    13 (.016)[.047]
 .05   --- (    )[.063]     9 (.031)[.094]    12 (.047)[.094]
 .10     6 (.063)[.188]     8 (.094)[.188]    11 (.094)[.172]

  α    n_1 = 5            n_1 = 6            n_1 = 7
 .01    18 (.008)[.023]    24 (.004)[.012]    30 (.006)[.012]
 .025   17 (.023)[.047]    22 (.023)[.043]    28 (.021)[.033]
 .05    16 (.047)[.086]    21 (.043)[.066]    26 (.049)[.070]
 .10    15 (.086)[.133]    19 (.098)[.141]    24 (.098)[.131]

  α    n_1 = 8            n_1 = 9            n_1 = 10
 .01    37 (.006)[.011]    44 (.008)[.012]    52 (.009)[.012]
 .025   34 (.024)[.035]    41 (.024)[.033]    49 (.022)[.029]
 .05    32 (.049)[.066]    39 (.044)[.058]    46 (.048)[.060]
 .10    30 (.088)[.113]    36 (.093)[.115]    43 (.090)[.108]

  α    n_1 = 11           n_1 = 12           n_1 = 13
 .01    61 (.008)[.011]    70 (.010)[.012]    80 (.010)[.012]
 .025   57 (.024)[.031]    66 (.024)[.029]    76 (.022)[.027]
 .05    54 (.047)[.057]    63 (.043)[.051]    72 (.044)[.051]
 .10    50 (.100)[.112]    58 (.096)[.110]    67 (.090)[.102]

  α    n_1 = 14           n_1 = 15
 .01    91 (.009)[.011]   103 (.008)[.010]
 .025   86 (.023)[.027]    97 (.022)[.026]
 .05    81 (.049)[.056]    92 (.045)[.051]
 .10    76 (.091)[.103]    85 (.099)[.110]


n_c = 3

  α    n_1 = 3            n_1 = 4            n_1 = 5
 .01   --- (    )[.016]    16 (.008)[.023]    21 (.004)[.012]
 .025   12 (.016)[.047]    15 (.023)[.047]    19 (.023)[.047]
 .05    11 (.047)[.094]    14 (.047)[.094]    18 (.047)[.078]
 .10    10 (.094)[.188]    13 (.094)[.156]    17 (.078)[.121]

  α    n_1 = 6            n_1 = 7            n_1 = 8
 .01    26 (.006)[.011]    32 (.006)[.012]    38 (.010)[.015]
 .025   24 (.023)[.039]    30 (.020)[.030]    36 (.023)[.032]
 .05    23 (.039)[.060]    28 (.046)[.065]    34 (.045)[.062]
 .10    21 (.092)[.129]    26 (.090)[.120]    32 (.081)[.104]

  α    n_1 = 9            n_1 = 10           n_1 = 11
 .01    46 (.008)[.011]    54 (.008)[.011]    63 (.008)[.010]
 .025   43 (.023)[.031]    51 (.021)[.027]    59 (.023)[.029]
 .05    41 (.041)[.054]    48 (.045)[.056]    56 (.044)[.053]
 .10    38 (.086)[.107]    45 (.084)[.101]    52 (.090)[.106]

  α    n_1 = 12           n_1 = 13           n_1 = 14
 .01    72 (.009)[.011]    82 (.009)[.011]    93 (.009)[.011]
 .025   68 (.023)[.028]    78 (.021)[.025]    88 (.022)[.026]
 .05    64 (.048)[.057]    73 (.049)[.056]    83 (.046)[.053]
 .10    60 (.090)[.104]    68 (.097)[.110]    77 (.098)[.110]

  α    n_1 = 15
 .01   104 (.009)[.011]
 .025   98 (.025)[.028]
 .05    93 (.048)[.055]
 .10    87 (.095)[.105]

n_c = 4

  α    n_1 = 4            n_1 = 5            n_1 = 6
 .01    20 (.004)[.012]    24 (.006)[.012]    29 (.006)[.012]
 .025   18 (.023)[.047]    22 (.023)[.041]    27 (.021)[.033]
 .05    17 (.047)[.082]    21 (.041)[.066]    26 (.033)[.052]
 .10    16 (.082)[.129]    20 (.067)[.101]    24 (.076)[.106]

  α    n_1 = 7            n_1 = 8            n_1 = 9
 .01    35 (.006)[.010]    41 (.008)[.013]    48 (.010)[.014]
 .025   33 (.017)[.026]    39 (.019)[.028]    46 (.019)[.026]
 .05    31 (.039)[.055]    37 (.038)[.052]    43 (.046)[.059]
 .10    29 (.075)[.101]    34 (.089)[.112]    40 (.092)[.113]

  α    n_1 = 10           n_1 = 11           n_1 = 12
 .01    56 (.010)[.013]    65 (.009)[.012]    74 (.010)[.013]
 .025   53 (.023)[.030]    61 (.025)[.031]    70 (.024)[.029]
 .05    50 (.048)[.060]    58 (.046)[.056]    67 (.042)[.050]
 .10    47 (.088)[.105]    54 (.094)[.109]    62 (.093)[.107]

  α    n_1 = 13           n_1 = 14           n_1 = 15
 .01    84 (.010)[.012]    95 (.009)[.011]   106 (.010)[.012]
 .025   80 (.022)[.026]    90 (.023)[.027]   101 (.022)[.025]
 .05    76 (.043)[.050]    85 (.048)[.055]    95 (.049)[.056]
 .10    70 (.099)[.112]    80 (.089)[.100]    89 (.096)[.107]

n_c = 5

  α    n_1 = 5            n_1 = 6            n_1 = 7
 .01    28 (.006)[.012]    33 (.006)[.010]    38 (.009)[.014]
 .025   26 (.021)[.034]    31 (.017)[.027]    36 (.021)[.030]
 .05    25 (.034)[.054]    29 (.041)[.059]    34 (.043)[.059]
 .10    23 (.079)[.112]    27 (.083)[.112]    32 (.079)[.103]

  α    n_1 = 8            n_1 = 9            n_1 = 10
 .01    45 (.007)[.010]    52 (.008)[.011]    60 (.008)[.010]
 .025   42 (.022)[.030]    49 (.021)[.028]    56 (.024)[.031]
 .05    40 (.041)[.054]    46 (.047)[.059]    53 (.048)[.059]
 .10    37 (.089)[.111]    43 (.091)[.110]    50 (.086)[.103]

  α    n_1 = 11           n_1 = 12           n_1 = 13
 .01    68 (.009)[.012]    78 (.008)[.010]    87 (.010)[.012]
 .025   64 (.025)[.031]    73 (.024)[.029]    83 (.022)[.026]
 .05    61 (.046)[.055]    69 (.049)[.058]    78 (.049)[.057]
 .10    57 (.091)[.106]    65 (.090)[.103]    73 (.097)[.109]

  α    n_1 = 14           n_1 = 15
 .01    98 (.009)[.011]   109 (.010)[.012]
 .025   93 (.022)[.026]   103 (.025)[.029]
 .05    88 (.047)[.053]    98 (.048)[.054]
 .10    82 (.097)[.108]    92 (.094)[.104]

n_c = 6

  α    n_1 = 6            n_1 = 7            n_1 = 8
 .01    37 (.009)[.014]    43 (.007)[.011]    49 (.008)[.012]
 .025   35 (.021)[.031]    40 (.023)[.032]    46 (.023)[.030]
 .05    33 (.044)[.061]    38 (.044)[.058]    44 (.040)[.052]
 .10    31 (.082)[.108]    35 (.097)[.122]    41 (.084)[.104]

  α    n_1 = 9            n_1 = 10           n_1 = 11
 .01    56 (.008)[.011]    64 (.008)[.011]    72 (.009)[.011]
 .025   53 (.021)[.027]    60 (.023)[.029]    68 (.024)[.029]
 .05    50 (.045)[.056]    57 (.045)[.055]    65 (.043)[.051]
 .10    47 (.084)[.102]    53 (.095)[.111]    60 (.098)[.113]

  α    n_1 = 12           n_1 = 13           n_1 = 14
 .01    81 (.010)[.012]    91 (.009)[.012]   102 (.009)[.011]
 .025   77 (.022)[.027]    86 (.024)[.029]    96 (.024)[.028]
 .05    73 (.046)[.054]    82 (.046)[.053]    91 (.050)[.056]
 .10    68 (.096)[.109]    77 (.090)[.101]    86 (.090)[.101]

  α    n_1 = 15
 .01   113 (.009)[.011]
 .025  107 (.023)[.027]
 .05   102 (.045)[.050]
 .10    95 (.097)[.107]

n_c = 7

  α    n_1 = 7            n_1 = 8            n_1 = 9
 .01    48 (.008)[.012]    54 (.009)[.012]    61 (.008)[.011]
 .025   45 (.023)[.031]    51 (.022)[.029]    58 (.020)[.025]
 .05    43 (.042)[.054]    48 (.048)[.060]    54 (.050)[.061]
 .10    40 (.087)[.108]    45 (.091)[.111]    51 (.090)[.107]

  α    n_1 = 10           n_1 = 11           n_1 = 12
 .01    69 (.008)[.010]    77 (.009)[.011]    86 (.009)[.011]
 .025   65 (.021)[.026]    73 (.021)[.026]    81 (.024)[.029]
 .05    61 (.049)[.059]    69 (.046)[.054]    77 (.048)[.055]
 .10    57 (.098)[.114]    64 (.100)[.115]    72 (.097)[.110]

  α    n_1 = 13           n_1 = 14           n_1 = 15
 .01    96 (.009)[.010]   106 (.009)[.011]   117 (.010)[.011]
 .025   91 (.022)[.025]   101 (.022)[.025]   111 (.024)[.027]
 .05    86 (.047)[.053]    96 (.044)[.050]   106 (.045)[.051]
 .10    81 (.090)[.101]    90 (.090)[.100]    99 (.096)[.106]

n_c = 8

  α    n_1 = 8            n_1 = 9            n_1 = 10
 .01    60 (.008)[.011]    67 (.008)[.010]    74 (.009)[.012]
 .025   57 (.020)[.026]    63 (.022)[.028]    70 (.023)[.028]
 .05    54 (.042)[.052]    60 (.043)[.052]    67 (.042)[.050]
 .10    50 (.093)[.111]    56 (.090)[.106]    62 (.096)[.111]

  α    n_1 = 11           n_1 = 12           n_1 = 13
 .01    82 (.010)[.012]    91 (.010)[.012]   101 (.009)[.011]
 .025   78 (.022)[.027]    86 (.025)[.029]    96 (.022)[.025]
 .05    74 (.046)[.054]    82 (.047)[.054]    91 (.046)[.052]
 .10    69 (.097)[.111]    77 (.093)[.105]    85 (.097)[.108]

  α    n_1 = 14           n_1 = 15
 .01   111 (.010)[.011]   122 (.010)[.011]
 .025  105 (.025)[.029]   116 (.023)[.027]
 .05   100 (.049)[.055]   110 (.049)[.055]
 .10    94 (.096)[.106]   104 (.092)[.101]


n_c = 9


  α    n_1 = 9            n_1 = 10           n_1 = 11
 .01    73 (.009)[.012]    81 (.008)[.010]    89 (.008)[.010]
 .025   69 (.024)[.029]    76 (.024)[.029]    84 (.022)[.027]
 .05    66 (.043)[.052]    72 (.049)[.057]    80 (.044)[.051]
 .10    61 (.099)[.115]    68 (.090)[.104]    75 (.090)[.103]

  α    n_1 = 12           n_1 = 13           n_1 = 14
 .01    97 (.010)[.012]   107 (.009)[.011]   117 (.009)[.011]
 .025   92 (.024)[.028]   101 (.024)[.028]   111 (.024)[.027]
 .05    88 (.045)[.051]    96 (.049)[.056]   106 (.045)[.051]
 .10    82 (.098)[.110]    91 (.090)[.100]    99 (.098)[.108]

  α    n_1 = 15
 .01   128 (.009)[.011]
 .025  122 (.022)[.025]
 .05   116 (.046)[.051]
 .10   109 (.093)[.102]


n_c = 10


  α    n_1 = 10           n_1 = 11           n_1 = 12
 .01    88 (.008)[.010]    96 (.008)[.010]   104 (.010)[.012]
 .025   83 (.023)[.027]    91 (.021)[.025]    99 (.023)[.026]
 .05    79 (.045)[.053]    86 (.047)[.054]    94 (.047)[.053]
 .10    74 (.093)[.106]    81 (.092)[.104]    88 (.098)[.109]

  α    n_1 = 13           n_1 = 14           n_1 = 15
 .01   114 (.009)[.010]   124 (.009)[.010]   135 (.009)[.010]
 .025  108 (.023)[.026]   117 (.025)[.028]   128 (.023)[.026]
 .05   103 (.045)[.051]   112 (.046)[.052]   122 (.046)[.051]
 .10    96 (.099)[.110]   105 (.097)[.107]   115 (.092)[.101]


APPENDIX  2 
THE  MONTE  CARLO  PROGRAM 


The Monte Carlo program listed in this appendix was written for this
research in FORTRAN (FORTXCG, i.e., System/370 FORTRAN H Extended
(Enhanced)). Computing was done utilizing the facilities of the
Northeast Regional Data Center of the State University System of
Florida, located on the campus of the University of Florida in
Gainesville. The program used available IMSL subroutines (e.g.,
GGUBS, GGNSM, RANK, etc.) whenever possible. The single precision
version of this library was used.

      DOUBLE PRECISION DSEED,DSEED2

      DIMENSION POWERS(45),X1(40),X2(40),SIGMA(3),X(40,2),
     @IRWVEC(40),RWVEC(40),WKVEC(2),DFTST(3),DFMST(3),C(40)
      REAL L1(3),L2(3)
      INTEGER REPS,N,NS,NNS,NPROB,NCD,NOCENS(41,41),D(40)
C
C THIS PROGRAM RUNS A MONTE CARLO FOR A SAMPLE SIZE UP TO 40
C
C OBTAIN PARAMETERS FOR THIS RUN OF THE MONTE CARLO
C
      CALL INIT(REPS,N,NS,NNS,XMU,L1,L2,SIGMA,DSEED,DSEED2)
      DO 5 I=1,NNS
    5 POWERS(I)=0.0
      NPROB=0
      NCD=0
C NPROB IS THE # OF SAMPLES WITH NO UNCENSORED PAIRS, WHILE
C NCD IS THE # OF TIMES CDTST HAS A NEG VARIANCE ESTIMATE
C
      DO 10 I=1,41
      DO 10 J=1,41
   10 NOCENS(I,J)=0
C
C START THE REPLICATIONS
C
      DO 100 IREPS=1,REPS
C
C GENERATE N RANDOM BIVARIATE NORMALS
C WITH COVARIANCE MATRIX SIGMA
C
      CALL SAMPLE(DSEED,DSEED2,N,SIGMA,WKVEC,IREPS,X,X1,
     @X2,C)
C
C NOW TO PERFORM THE CENSORING ON THE RANDOM VARIABLES
C
      CALL CENSOR(X1,X2,C,N,D)
C
C CALCULATE THE TEST STATISTICS
C
      CALL DFSTAT(X1,X2,D,N,XMU,L1,L2,NPROB,NOCENS,DFTST,
     @DFMST)
      IF (DFTST(1) .EQ. 999.9) GO TO 100
C
      CALL CDSTAT(X1,X2,D,N,CDTST,CDTST2,CDTST3,NCD)
C
C COLLECT SUMMARY AND POWER STATISTICS
C
      CALL POWER(NS,NNS,DFTST,DFMST,CDTST,CDTST2,CDTST3,
     @POWERS)
  100 CONTINUE
C
C PRINT OUT THE RESULTS
C
      WRITE(6,500) NPROB
  500 FORMAT('-','THE NUMBER OF SAMPLES DISCARDED, DUE ',
     @'TO N1 = 0 WAS',1X,I8)
      WRITE(6,502) NCD
  502 FORMAT('-','THE NUMBER OF SAMPLES WHICH HAD A ',
     @' NEGATIVE VARIANCE ESTIMATE FOR CD WAS ',1X,I8)
      WRITE(6,505)
  505 FORMAT('-','THE DISTRIBUTION OF CENSORING: THE ROWS',
     @' ARE FOR TYPE 2 OR 3 AND THE COLUMNS FOR TYPE 4' //)
      DO 510 I=1,41
  510 WRITE(6,515) I,(NOCENS(I,J),J=1,26)
  515 FORMAT(' ',I2,':',4X,26I4)
      IF (N .LT. 26) GO TO 530
      DO 520 I=1,41
  520 WRITE(6,525) I,(NOCENS(I,J),J=27,41)
  525 FORMAT(' ',I2,':',4X,25I4)
  530 CONTINUE
      NPROP=0
      NX=N+1
      DO 600 I=1,NX
      DO 601 J=1,NX
  601 NPROP=NPROP+(((I-1)+(J-1))*NOCENS(I,J))
  600 CONTINUE
      AVG=(FLOAT(NPROP))/(FLOAT(REPS))
      WRITE(6,610) AVG
  610 FORMAT('-','THE AVERAGE NUMBER OF OBSERVATIONS',
     @' CENSORED IS: ',F10.5)
      WRITE(6,550)
  550 FORMAT('-',15X,'FINAL RESULTS',/,20X,
     @'STAT #1: DFTST1',/20X,'STAT #2: DFTST2',
     @/20X,'STAT #3: DFTST3',/20X,'STAT #4: DFMST1',
     @/20X,'STAT #5: DFMST2',/20X,'STAT #6: DFMST3',
     @/20X,'STAT #7: CDTST (WITHOUT MEAN)',
     @/20X,'STAT #8: CDTST2 (WITH MEAN)',
     @/20X,'STAT #9: CDTST3 (ASYMP WITH MEAN)')
C
      CALL USWSM('POWER MATRIX/REJECTS',20,POWERS,NS,1)
      STOP
      END

      SUBROUTINE INIT(REPS,N,NS,NNS,XMU,L1,L2,SIGMA,DSEED,
     @DSEED2)
C
C THIS SUBROUTINE READS THE NUMBER OF REPLICATIONS (REPS),
C THE SAMPLE SIZE PER RUN (N), THE POPULATION COVARIANCE
C MATRIX (SIGMA), AND THE VECTORS L1 AND L2 FOR THE STATISTIC
C DFTST.  IT STORES SIGMA IN SYMMETRIC STORAGE MODE (IMSL).
C
      DOUBLE PRECISION DSEED,DSEED2
      REAL L1(3),L2(3)
      DIMENSION SIGMA(3)
      INTEGER REPS
      REPS=1000
      N=25
      NS=9
      NNS=(NS*(NS+1))/2
      L1(1)=1.0
      L2(1)=1.0
      L1(2)=2.0
      L2(2)=1.0
      L1(3)=1.0
      L2(3)=2.0
      DSEED=335768.D0
      DSEED2=672344.D0
C
      WRITE(6,100) REPS,N
  100 FORMAT('-',15X,'# REPS = ',I4,10X,'SAMPLE SIZE (N) = ',
     @I2)
C
C NOW TO READ IN THE COMMON LOCATION PARAMETER (XMU)
C
      XMU=0.0
C
C NOW TO ENTER THE COVARIANCE MATRIX (SIGMA)
C
      RHO=.2
      VARX1=1.0
      VARX2=3.0
      SIGMA(1)=VARX1
      SIGMA(2)=RHO*(SQRT(VARX1))*(SQRT(VARX2))
      SIGMA(3)=VARX2
C
C ECHO CHECK
C
      WRITE(6,105)
  105 FORMAT('0'// 10X,'BIVARIATE NORMAL DISTRIBUTION ',
     @'GENERATED')
      CALL USWSM('COV. MATRIX SIGMA',17,SIGMA,2,2)
      RETURN
      END

      SUBROUTINE SAMPLE(DSEED,DSEED2,N,SIGMA,WKVEC,IREPS,
     @X,X1,X2,C)
C
C THIS SUBROUTINE GENERATES THE NX2 RANDOM VECTOR OF
C OBSERVATIONS
C
      DOUBLE PRECISION DSEED,DSEED2
      DIMENSION X(N,2),WKVEC(2),SIGMA(3),X1(N),X2(N),
     @U(40),C(N)
      INTEGER N,IER
C
C CALL THE IMSL NORMAL RANDOM VECTOR SUBROUTINE
C
      WKVEC(1)=1.0
      IF (IREPS .EQ. 1) WKVEC(1)=0.0
      CALL GGNSM(DSEED,N,2,SIGMA,N,X,WKVEC,IER)
      IF (IER .NE. 0) WRITE(6,500) IREPS,IER
  500 FORMAT('-',10X,'GGNSM ERROR, REPLICATION=',I4,
     @',IER=',I4)
      DO 25 I=1,N
      X1(I)=X(I,1)
      X2(I)=X(I,2)
   25 CONTINUE
C
C NOW TO GENERATE THE CENSORING DISTRIBUTION
C
C FIRST GENERATE A SAMPLE OF N UNIFORM(0,1) R.V.'S
C
      CALL GGUBS(DSEED2,N,U)
C
C NOW TO GENERATE THE CENSORING RANDOM VARIABLES
C
      DO 28 I=1,N
   28 C(I)=ALOG(6.9975*U(I))
C28   C(I)=ALOG(6.8371*U(I))
C28   C(I)=ALOG(6.3369*U(I))
C
      RETURN
      END

      SUBROUTINE RANK(NU,Z,RZ,IRWVEC,RWVEC)
C
C THIS SUBROUTINE CALCULATES THE VECTOR RANKS
C
      DIMENSION Z(NU),RZ(NU),IRWVEC(NU),RWVEC(NU)
      EPS=0.00000001
      CALL NMRANK(Z,NU,EPS,IRWVEC,RWVEC,RZ,S2,S3)
      RETURN
      END

      SUBROUTINE CENSOR(X1,X2,C,N,D)
C
C THIS SUBROUTINE CENSORS THE DATA AND CREATES A VECTOR D OF
C THE TYPE OF CENSORED PAIR, A PARTICULAR PAIR IS.
C
      DIMENSION X1(N),X2(N),C(N)
      INTEGER D(N)
C
      DO 6 I=1,N
      IF (X1(I) .NE. X2(I)) GO TO 100
      IF (X2(I) .LE. C(I)) D(I)=1
      IF (X2(I) .GT. C(I)) GO TO 102
      GO TO 6
  100 IF (X2(I) .LE. X1(I)) GO TO 4
      IF (X2(I) .GT. C(I)) GO TO 105
      D(I)=1
      GO TO 6
  105 IF (X1(I) .LE. C(I)) GO TO 110
  102 D(I)=4
      X1(I)=C(I)
      X2(I)=C(I)
      GO TO 6
  110 D(I)=2
      X2(I)=C(I)
      GO TO 6
    4 IF (X2(I) .LE. C(I)) GO TO 115
      D(I)=4
      X1(I)=C(I)
      X2(I)=C(I)
      GO TO 6
  115 IF (X1(I) .LE. C(I)) GO TO 120
      D(I)=3
      X1(I)=C(I)
      GO TO 6
  120 D(I)=1
    6 CONTINUE
      RETURN
      END


      SUBROUTINE ESTMU(X1,X2,D,N,EMU)
C
C THIS SUBROUTINE CALCULATES THE COMBINED SAMPLE MEDIAN
C USING THE KAPLAN MEIER ESTIMATOR WITH SMOOTHING (EMU)
C AND WITHOUT SMOOTHING (SMU).
C
      DIMENSION X1(N),X2(N),RVEC(80),RY(80),Y(80),YY(80),
     @IWVEC(80),S(80)
      INTEGER D(N),DD(80),DYY(80)
      K=1
      DO 10 I=1,N
      IF (D(I)-3) 20,20,11
   20 J=2*K
      JJ=J-1
      Y(J)=X2(I)
      Y(JJ)=X1(I)
      IF (D(I)-2) 22,24,26
   22 DD(J)=1
      DD(J-1)=1
      GO TO 28
   24 DD(J)=0
      DD(J-1)=1
      GO TO 28
   26 DD(J)=1
      DD(J-1)=0
      GO TO 28
   11 J=2*K
      JJ=J-1
      Y(J)=X2(I)+0.0000001
      Y(JJ)=X1(I)
      DD(J)=0
      DD(JJ)=0
   28 K=K+1
   10 CONTINUE
C
      CALL RANK(J,Y,RY,IWVEC,RVEC)
      DO 780 I=1,J
      L=IFIX(RY(I))
      YY(L)=Y(I)
  780 DYY(L)=DD(I)
C
C COMPUTING THE KAPLAN MEIER ESTIMATE OF THE SURVIVAL
C FUNCTION
C
      XJ=FLOAT(J)
      S(1)=((XJ-1.0)/XJ)**DYY(1)
      DO 500 I=2,JJ
      XI=FLOAT(I)
  500 S(I)=S(I-1)*(((XJ-XI)/(XJ-XI+1.0))**DYY(I))
      S(J)=0.0
C
C NOW TO CALCULATE THE MEDIAN ESTIMATES
C
      DO 550 I=1,J
      IF (S(I)-.500000) 530,540,550
  540 EMU=YY(I)
      GO TO 560
  530 E1=YY(I)
      E2=YY(I-1)
      EMU=E1-((E1-E2)*(.500000-S(I))/(S(I-1)-S(I)))
      SMU=(E1+E2)/2.0
      GO TO 560
  550 CONTINUE
  560 CONTINUE
C
      RETURN
      END

S  UB  POUT I N£    DFS  TAT ( X1  ,  X2  ,  D ,  N , XS U , L 1 , L 2 , N? B CB, NOC ENS , 
oBDFTST,DFMST) 
C 

C    THIS    SUBROUTINE   COMPUTES    THE    TEST    STATISTIC    CALLED    DFTST 
C     (DFMST),     WHICH    IS    THE    COMBINATION    OF    TWO    SIGNED    SANK 
C    STATISTICS       DFTST  (I)     =    L  1  ( I)  *i*ILCG  XON    +    L2  (I)  *  HI  LCOXCN 
C 

SEAL   L1  (3)  ,L2(3) 

DIMENSION  XI  (N)  ,X2  (N)  ,Z(40)  ,T2  3  (40)  ,  PHI  (40 )  ,  BH  VEC  (40)  , 
aDFHST(3)  , DFTST  (3)  ,3Z(40)  ,T(3)  ,VAET(3)  ,SDT(3)  ,RT23(40)  , 
d)KZZ(40)  ,  IE »  VEC  (4  0)  , GAS  (40)  ,  ZZ  (40)  ,ZPHI  (40)  ,TT  (3)  ,  E  I(  3) 
INTEGER  D(N)  , NCCSNS ( 41 , 4  1 ) 
81=0 
NC=0 
C 

C  NCa  TO  ESTIMATE  THE  VALUE  OF  MO  (E»U)  USING  THE  PEODUCT- 
C  LIMIT  ESTIMATOB  BASED  ON  THE  ENTIEE  SAMPLE 
C 

CALL  ESTMU(X1,X2,D,N,EMU) 
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C    IKS    FIRST    PART    OF    THE    SUBROUTINE,    UTILIZES    WHAT    TYPE   OF 
C    PAIR     (X1,X2)     IS:       TYPE    1  =    UNCEKSGR£D,TYPE    2=    X2   CEHS03ED, 
C  TYPE    3=    XI    CENSORED, TYPE    4=CCTH    CENSORED 

C    AND    PLACES    THE    UNCEN3CE3D    CALCULATION'S    IN    T1     ,    WHILE    TEE 
C    TYPE    2    OR    3    CENSORED    C    VALUES    GO    INTO    THE    VECTOR    T23 
C 

DO    6    1=1 ,N 

IF     (D(I)     .  EQ.     4)     GO    TO    6 

IF     (D(I)-2)     5,3,4 
C 

C  A  PAIR  HILL  GO  TO  3,  IF  IT  IS  A  TYPE  2  PAIR 
C 

3  NC=NC+1 

GAM  (NC)  =1.0 

T23  (HC)=X2(I) 

GO  TO  6 
C 

C  A  PAIR  WILL  GO  TO  4,  IF  IT  IS  A  TYPE  3  PAIS 
C 

4  NC=NC+1 
GA«(NC)=0, 0 
T23  (NC)=X1  (I) 
GO  TO  6 

C 

C    A    PAIR    WILL    GO    TO    5,     I?    IT    IS    A    TYPE     1    PAIS 
C 
o  N1=  N1  +  1 

Z(H1J=(A3S  ((X2(I))  -iaO)  )  -(ADS  (  (X  1  (I)  )-Xfi£J)  ) 

ZZ(N1)  =  (AB3((X2(I)  )-EMU))-(ABS((Xl  (I)) -EMU)) 

PHI  (N1)  =  (SIGN  (1.0,Z  (Nl))/2.0)  +0.  5 

ZPHI(NI)  =  (SIGN  (1.0,  ZZ(Nl))/2.0)+0,  5 

Z(N1)  =  ABS  IZ(N1)) 

ZZ(N1) =AES  (ZZ(N1)) 
6  CONTINUE 


IF     (N1    .EQ.     0)     GO    TO    100 
C 

C    TO    INSERT    WHAT    TYPE    OF    CENSORING    OCCURRED    INTO    THE    MATRIX 
C    NOCENS     (NN4=#TYPE    4    PAIRS    +     1,     NNC=#TYPE    2    OR    3    PAIRS    +     1) 

NN4=1+N-N1-NC 

NNC=NC+1 

NOCENS (NNC,NN4)= NOCENS (NNC,NN4) +1 
C
C  CALCULATING THE ABSOLUTE RANKS FOR THE N1 UNCENSORED OBS.
C
      CALL RANK(N1,Z,RZ,IRWVEC,RWVEC)
      CALL RANK(N1,ZZ,RZZ,IRWVEC,RWVEC)
C
C  NOW TO CALCULATE THE RANKS OF THE C'S FOR THE TYPE 2 AND 3
C
      IF (NC .NE. 0) GO TO 25
      WRITE (6,23) NC
23    FORMAT('-',' THERE ARE NO TYPE 2 OR 3 CENSORED ',
     $'OBSERVATIONS, NC = ',I3)
      GO TO 28
25    CONTINUE
      CALL RANK(NC,T23,RT23,IRWVEC,RWVEC)
C
28    CONTINUE
C
C  NOW TO CALCULATE THE WILCOXON TYPE STATISTICS, SUM1 AND
C  SUMC, AND THE CORRESPONDING EXPECTED VALUES AND VARIANCES
C
      SUM1=0.0
      SUM11=0.0
      DO 30 I=1,N1
      SUM1=SUM1+(PHI(I)*RZ(I))
      SUM11=SUM11+(ZPHI(I)*RZZ(I))
30    CONTINUE
C
C  NULL MEAN AND VARIANCE OF A WILCOXON SIGNED RANK STATISTIC
C  BASED ON N1 OBSERVATIONS
C
      VAR1=(FLOAT(N1*(N1+1)*((2*N1)+1)))/24.0
      E1=(FLOAT(N1*(N1+1)))/4.0
C
      SUMC=0.0
      IF (NC .NE. 0) GO TO 33
      VARC=0.0
      EC=0.0
      GO TO 35
33    CONTINUE
      DO 34 I=1,NC
      SUMC=SUMC+(GAM(I)*RT23(I))
34    CONTINUE
      VARC=(FLOAT(NC*(NC+1)*((2*NC)+1)))/24.0
      EC=(FLOAT(NC*(NC+1)))/4.0
C
35    CONTINUE
C
C  NOW TO CALCULATE THE DFTST
C
      DO 38 I=1,3
      T(I)=(L1(I)*SUM1)+(L2(I)*SUMC)
      TT(I)=(L1(I)*SUM11)+(L2(I)*SUMC)
      ET(I)=(L1(I)*E1)+(L2(I)*EC)
      VART(I)=((L1(I)**2)*VAR1)+((L2(I)**2)*VARC)
      SDT(I)=SQRT(VART(I))
      DFTST(I)=(T(I)-ET(I))/SDT(I)
      DFMST(I)=(TT(I)-ET(I))/SDT(I)
38    CONTINUE
      GO TO 47
100   CONTINUE
C
C  THERE IS A PROBLEM, N1=0, THUS THE SAMPLE IS NOT GOING TO
C  BE USED IN THE POWER STUDY.  THE TEST STATISTIC WILL BE
C  SET TO 999.9 WHICH WILL BE USED AS AN INDICATOR.
C
      DO 102 I=1,3
      DFMST(I)=999.9
102   DFTST(I)=999.9
      WRITE(6,104)
104   FORMAT('-','*** PROBLEM, N1 = 0, THE SAMPLE WILL NOT ',
     $'BE USED IN THE POWER STUDY')
      NPROB=NPROB+1
47    RETURN
      END


      SUBROUTINE CDSTAT(X1,X2,D,N,CDTST,CDTST2,CDTST3,NCD)
C
C  THIS SUBROUTINE CALCULATES THE CONCORDANT - DISCORDANT
C  TYPE STATISTIC (CDTST).
C
      DIMENSION X1(N),X2(N),Y1(40),Y2(40)
      INTEGER D(N),A(40,40),B(40,40),SUMCD,SUMA,SUMG,SUM1,
     @SUM2,SUM3,SUMGG
      DO 1 I=1,N
      DO 2 J=1,N
      A(I,J)=0
2     B(I,J)=0
1     CONTINUE

      DO 5 I=1,N
      Y1(I)=X1(I)+X2(I)
5     Y2(I)=X1(I)-X2(I)
C
C  NOW TO CALCULATE THE A(I,J) AND B(I,J) MATRICES
C
      NN=N-1
      DO 10 I=1,NN
      II=I+1
      DO 20 J=II,N

      IF (Y1(I)-Y1(J)) 200,230,220
200   IF (D(I) .EQ. 1) A(I,J)=1
      GO TO 230
220   IF (D(J) .EQ. 1) A(I,J)=-1
230   IF (D(I) .EQ. 4) GO TO 10
      IF (D(J) .EQ. 4) GO TO 20
      IF (Y2(I)-Y2(J)) 240,20,260
240   IF (D(I)-3) 242,20,20
242   IF (D(J) .EQ. 1) B(I,J)=1
      IF (D(J) .EQ. 3) B(I,J)=1
      GO TO 20
260   IF (D(J)-3) 262,20,20
262   IF (D(I) .EQ. 1) B(I,J)=-1
      IF (D(I) .EQ. 3) B(I,J)=-1
20    CONTINUE
10    CONTINUE


C  CALCULATING THE CONCORDANT-DISCORDANT STATISTIC (CDTST)
C
      SUMCD=0
      SUMA=0
      SUMG=0
      SUM1=0
      SUM2=0
      SUM3=0
      SUMGG=0
      IN=0
      NN=N-1
      DO 40 I=1,NN
      II=I+1
      DO 42 J=II,N
      SUMCD=SUMCD+(A(I,J)*B(I,J))
      SUMA=SUMA+IABS(A(I,J)*B(I,J))
      IF (J .EQ. N) GO TO 45
      JJ=J+1
      DO 45 K=JJ,N
      SUM1=SUM1+(A(I,J)*A(I,K)*B(I,J)*B(I,K))
      SUM2=SUM2+(A(I,K)*A(J,K)*B(I,K)*B(J,K))
      SUM3=SUM3+(A(I,J)*A(J,K)*B(I,J)*B(J,K))
      IN=IN+3
45    CONTINUE
42    CONTINUE
40    CONTINUE

      VN=FLOAT(N)
      VNN=VN*(VN-1.000)
      CD=2.000*(FLOAT(SUMCD))/VNN
      SUMG=SUM1+SUM2+SUM3
      SUMGG=SUM1+SUM2+SUM3+SUMA
      AAA=(2.000*(FLOAT(SUMA))/VNN)-(CD*CD)
      VIN=FLOAT(IN)
      VNNN=FLOAT(N*((N-1)**2))
      G=(FLOAT(SUMG))/VIN
      GGG=((2.000*(FLOAT(SUMGG)))/VNNN)-(CD*CD)
      VARCD=(4.000*G)/VN
      VARCD2=((2.000*AAA)+(4.000*(VN-2.000)*GGG))/VNN
      VARCD3=(4.000*GGG)/VN
      IF (VARCD2 .GT. 0.0) GO TO 55
      CDTST=0.00
      CDTST2=0.00
      CDTST3=0.00
      NCD=NCD+1
      GO TO 65
55    SDCD=SQRT(VARCD)
      SDCD2=SQRT(VARCD2)
      SDCD3=SQRT(VARCD3)
      CDTST=(CD)/(SDCD)
      CDTST2=(CD)/(SDCD2)
      CDTST3=(CD)/(SDCD3)
65    CONTINUE
      RETURN
      END

      SUBROUTINE POWER(NS,NNS,DFTST,DFMST,CDTST,CDTST2,
     $CDTST3,POWERS)
      DIMENSION POWERS(NNS),REJECT(9),DFTST(3),DFMST(3)
C
C  NOTE:  NS=NUMBER OF STATISTICS CALCULATED
C         NNS= NS(NS+1)/2
C
C  GIVE THE CRITICAL VALUES FOR THE TEST STATISTICS
C
      ZCRIT=1.645
      RZCRIT=-1.645
C
C  CALCULATE THE POWERS
C
      DO 10 I=1,NS
10    REJECT(I)=0.0
      IF (DFTST(1) .GE. ZCRIT) REJECT(1)=1.0
      IF (DFTST(2) .GE. ZCRIT) REJECT(2)=1.0
      IF (DFTST(3) .GE. ZCRIT) REJECT(3)=1.0
      IF (DFMST(1) .GE. ZCRIT) REJECT(4)=1.0
      IF (DFMST(2) .GE. ZCRIT) REJECT(5)=1.0
      IF (DFMST(3) .GE. ZCRIT) REJECT(6)=1.0
      IF (CDTST .LE. RZCRIT) REJECT(7)=1.0
      IF (CDTST2 .LE. RZCRIT) REJECT(8)=1.0
      IF (CDTST3 .LE. RZCRIT) REJECT(9)=1.0
      DO 20 J=1,NS
      JJ=J+1
      K=(J*(J-1)/2+J)
      POWERS(K)=POWERS(K)+REJECT(J)
      IF (JJ .GT. NS) GO TO 20
      DO 20 I=JJ,NS
      K=(I*(I-1)/2+J)
      POWERS(K)=POWERS(K)+REJECT(I)*REJECT(J)
20    CONTINUE
      RETURN
      END